Torch7 - Reading CSV into tensor

Loading content from CSV files in Torch is not as easy as it should be (at least for a Lua beginner). I started with the csvigo module, intending to load the data into a table first and then move it into a tensor. That worked, but only for the test set: the training data was too big. csvigo could not handle a 350MB file and failed with a "not enough memory" error. According to a thread on Google Groups, rebuilding Torch with different flags could help.

So I decided to use a lower-level mechanism: reading the CSV as a plain file, line by line. The core part of the code looks like this:

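-- Read data from CSV to tensor
-- filePath, ROWS and COLS are defined elsewhere in the script;
-- ROWS is the number of data rows, not counting the header.
require 'torch'

-- assert() reports a readable error if the file cannot be opened
local csvFile = assert(io.open(filePath, 'r'))
local header = csvFile:read()  -- consume the header line

-- Preallocate the tensor that will hold all values
local data = torch.Tensor(ROWS, COLS)

local i = 0
for line in csvFile:lines() do
  i = i + 1
  -- Split the line on commas (naive: quoted fields are not handled)
  local l = line:split(',')
  for key, val in ipairs(l) do
    -- Fields are read as text, so convert them to numbers explicitly
    data[i][key] = tonumber(val)
  end
end

csvFile:close()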
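
The snippet above assumes that ROWS and COLS are already known. One possible way to get them is an extra pass over the file before allocating the tensor. Below is a minimal sketch in plain Lua (no Torch required); the helper names splitLine and csvShape are hypothetical and not part of Torch or csvigo, and the splitting is naive (no quoted fields).

```lua
-- Hypothetical helper: split a line on a separator string.
-- Naive: a separator inside a quoted field is still treated as a separator.
local function splitLine(line, sep)
  -- Escape punctuation so the separator is matched literally in the pattern
  local escaped = sep:gsub('%p', '%%%0')
  local fields = {}
  -- Append one separator so the last field is captured as well
  for field in (line .. sep):gmatch('(.-)' .. escaped) do
    fields[#fields + 1] = field
  end
  return fields
end

-- Hypothetical helper: return the number of data rows (header excluded)
-- and the number of columns (taken from the header line).
local function csvShape(path, sep)
  local file = assert(io.open(path, 'r'))
  local header = file:read('*l')
  local cols = header and #splitLine(header, sep) or 0
  local rows = 0
  for _ in file:lines() do
    rows = rows + 1
  end
  file:close()
  return rows, cols
end
```

With these helpers, something like local ROWS, COLS = csvShape(filePath, ',') could precede the torch.Tensor(ROWS, COLS) allocation. The file is read twice, but only one line is held in memory at a time.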

If you are looking for the whole script, check the GitHub repository.