Torch7 - Reading CSV into tensor
Loading content from CSV files in Torch is not as easy as it should be (at least for a Lua beginner). I started with the csvigo module and wanted to load the data into a table first and then move it into a tensor. It worked, but only for the test set... the data intended for training the model was too big. Yes, csvigo wasn't able to handle a 350MB file and returned a nice error: not enough memory. According to a thread on Google Groups, rebuilding Torch with different flags could help.
So I decided to use a lower-level mechanism: reading the CSV as a plain file, line by line. The core part of the code looked like this:
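-- Read data from CSV to tensor
-- ROWS and COLS (number of data rows and columns) must be known in advance
local csvFile = io.open(filePath, 'r')
local header = csvFile:read()  -- consume the header line so it is not parsed as data
local data = torch.Tensor(ROWS, COLS)
local i = 0
for line in csvFile:lines('*l') do
  i = i + 1
  local l = line:split(',')
  for key, val in ipairs(l) do
    -- fields are read as strings; convert explicitly before storing
    data[i][key] = tonumber(val)
  end
end
csvFile:close()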
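The snippet above expects ROWS and COLS to be set before the tensor is allocated. If they are not known, one option is an extra streaming pass over the file that only counts lines, which keeps memory use flat. The helper below is a minimal sketch of that idea, not part of the original script: the name csvDimensions is hypothetical, and it assumes a single header line, a comma separator, and no quoted fields containing commas or newlines.

```lua
-- Hypothetical helper: count data rows and columns of a CSV file
-- in a single streaming pass, without loading it into memory.
-- Assumes: first line is a header, ',' separator, no quoted fields.
local function csvDimensions(filePath)
  local csvFile = assert(io.open(filePath, 'r'))
  local header = csvFile:read('*l')
  if not header then
    -- empty file: nothing to count
    csvFile:close()
    return 0, 0
  end
  -- number of columns = number of separators in the header + 1
  local _, separators = header:gsub(',', '')
  local cols = separators + 1
  local rows = 0
  for line in csvFile:lines() do
    -- skip blank lines (e.g. a trailing empty line)
    if line ~= '' then
      rows = rows + 1
    end
  end
  csvFile:close()
  return rows, cols
end
```

Calling local ROWS, COLS = csvDimensions(filePath) before torch.Tensor(ROWS, COLS) would then supply both values. The cost is reading the file twice, which is usually acceptable compared with running out of memory.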
If you are looking for the whole script, check the GitHub repository.