Open
Description
I'm using `limit` and `skipto` to parse part of a larger file. I get correct results for small files, but for a large file, the read starts one line later than it should:
big_file_kwargs = Dict(:skipto => 15, :limit => 2000)
small_file_kwargs = Dict(:skipto => 21, :limit => 5)
common_kwargs = Dict(:header => false, :ntasks => 1, :delim => '\t')
small_filepath = "case5.m"
big_filepath = "ACTIVSg2000.m"
using CSV
using DataFrames
# correct
df1 = DataFrame(CSV.File(small_filepath;
                         common_kwargs...,
                         small_file_kwargs...))
display(df1[1, :])
for (i, line) in enumerate(eachline(small_filepath))
    if i == small_file_kwargs[:skipto]
        println("first row of dataframe should be: $line")
    end
end
# incorrect: starts at line 16, not line 15.
df2 = DataFrame(CSV.File(big_filepath;
                         common_kwargs...,
                         big_file_kwargs...))
display(df2[1, :])
for (i, line) in enumerate(eachline(big_filepath))
    if i == big_file_kwargs[:skipto]
        println("first row of dataframe should be: $line")
    end
end
The inputs I'm using can be found here.