Introduction to DataFrames
Bogumił Kamiński, Apr 21, 2018
Reference
Series
- https://deepstat.tistory.com/69 (01. constructors)(in English)
- https://deepstat.tistory.com/70 (01. constructors)(in Korean)
- https://deepstat.tistory.com/71 (02. basicinfo)(in English)
- https://deepstat.tistory.com/72 (02. basicinfo)(in Korean)
- https://deepstat.tistory.com/73 (03. missingvalues)(in English)
- https://deepstat.tistory.com/74 (03. missingvalues)(in Korean)
- https://deepstat.tistory.com/75 (04. loadsave)(in English)
- https://deepstat.tistory.com/76 (04. loadsave)(in Korean)
- https://deepstat.tistory.com/77 (05. columns)(in English)
- https://deepstat.tistory.com/78 (05. columns)(in Korean)
- https://deepstat.tistory.com/79 (06. rows)(in English)
- https://deepstat.tistory.com/80 (06. rows)(in Korean)
- https://deepstat.tistory.com/81 (07. factors)(in English)
- https://deepstat.tistory.com/82 (07. factors)(in Korean)
- https://deepstat.tistory.com/83 (08. joins)(in English)
- https://deepstat.tistory.com/84 (08. joins)(in Korean)
- https://deepstat.tistory.com/85 (09. reshaping)(in English)
- https://deepstat.tistory.com/86 (09. reshaping)(in Korean)
- https://deepstat.tistory.com/87 (10. transforms)(in English)
- https://deepstat.tistory.com/88 (10. transforms)(in Korean)
- https://deepstat.tistory.com/89 (11. performance)(in English)
- https://deepstat.tistory.com/90 (11. performance)(in Korean)
- https://deepstat.tistory.com/91 (12. pitfalls)(in English)
- https://deepstat.tistory.com/92 (12. pitfalls)(in Korean)
In [1]:
using DataFrames
Possible pitfalls
Know what is copied when creating a DataFrame
In [2]:
x = DataFrame(rand(3, 5))
Out[2]:
In [3]:
y = DataFrame(x)
x === y # no copying performed
Out[3]:
In [4]:
y = copy(x)
x === y # not the same object
Out[4]:
In [5]:
all(x[i] === y[i] for i in 1:ncol(x)) # but the columns are the same objects (1:ncol(x) checks every column, not just the last)
Out[5]:
In [6]:
x = 1:3; y = [1, 2, 3]; df = DataFrame(x=x,y=y) # likewise, vectors passed to the constructor or assigned as columns are not copied; ranges are the exception
Out[6]:
In [7]:
y === df[:y] # the same object
Out[7]:
In [8]:
typeof(x), typeof(df[:x]) # range is converted to a vector
Out[8]:
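To keep a DataFrame independent of the vector it was built from, one option is to pass a copy. A minimal sketch, assuming the installed DataFrames version supports `df.y` property access (the cells above use `df[:y]`):

```julia
using DataFrames

y = [1, 2, 3]
df = DataFrame(y=copy(y))   # hand the DataFrame its own vector, not y itself

y[1] = 100                  # later changes to y ...
df.y                        # ... do not reach the column, which is still [1, 2, 3]
```

The explicit `copy` costs one allocation per column, which is usually a fair price when the source vector will be mutated afterwards.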
Do not modify the parent of GroupedDataFrame
In [9]:
x = DataFrame(id=repeat([1,2], outer=3), x=1:6)
g = groupby(x, :id)
Out[9]:
In [10]:
x[1:3, 1]=[2,2,2]
g # g is wrong now: it is only a view of x, and the grouping was computed before x changed
Out[10]:
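One way to stay safe is to treat a GroupedDataFrame as stale after any change to its parent and to call `groupby` again. A sketch reusing the data from the cells above; the name `group_sizes` is only for illustration, and the order of the groups is deliberately not relied on:

```julia
using DataFrames

x = DataFrame(id=repeat([1, 2], outer=3), x=1:6)
x[1:3, 1] = [2, 2, 2]      # modify the parent first; id is now [2, 2, 2, 2, 1, 2]
g = groupby(x, :id)        # group only after the change

# number of rows in each group, sorted so the result does not depend on group order
group_sizes = sort([size(sdf, 1) for sdf in g])
```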
Remember that you can filter columns of a DataFrame using booleans
In [11]:
using Random
Random.seed!(1)
x = DataFrame(rand(5, 5))
Out[11]:
In [12]:
x[x[:x1] .< 0.25] # oops: a single boolean index selects columns, so we filtered columns instead of rows by accident
Out[12]:
In [13]:
x[x[:x1] .< 0.25, :] # probably this is what we wanted
Out[13]:
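The difference is easier to see on a small table with fixed values. A sketch with a hypothetical three-column DataFrame, assuming `df.a` property access is available in the installed version; the mask has three entries, so it happens to fit both the rows and the columns:

```julia
using DataFrames

df = DataFrame(a=[0.1, 0.5, 0.2], b=[1, 2, 3], c=[4, 5, 6])
mask = df.a .< 0.25     # one Bool per row: true, false, true

rows = df[mask, :]      # mask in the row position keeps rows 1 and 3
cols = df[:, mask]      # mask in the column position keeps columns a and c
```

Always writing both indices makes the intent explicit and avoids the accidental column filter.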
Column selection for DataFrame creates aliases unless explicitly copied
In [14]:
x = DataFrame(a=1:3)
x[:b] = x[1] # alias
x[:c] = x[:, 1] # also alias
x[:d] = x[1][:] # copy
x[:e] = copy(x[1]) # explicit copy
display(x)
x[1,1] = 100
display(x)
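The same distinction between an alias and a copy holds for plain Julia vectors, independent of DataFrames. A small sketch with illustrative names:

```julia
a = [1, 2, 3]
b = a          # alias: b and a are the same array
c = a[:]       # indexing with a colon allocates a new array
d = copy(a)    # explicit copy

a[1] = 100     # visible through b, but not through c or d
```

After the assignment `b[1]` is 100 while `c[1]` and `d[1]` are still 1, which mirrors what the two `display(x)` calls show for the columns.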