dataframe - How to transform particular rows into columns in R
I'm new to R, and this question might seem easy to some of you. I have data like this:
    > data.frame(table(dat),total)
       ageintervals mytest.g_b_flag freq total
    1        (1,23]               0 5718  5912
    2       (23,26]               0 5249  5579
    3       (26,28]               0 3105  3314
    4       (28,33]               0 6277  6693
    5       (33,37]               0 4443  4682
    6       (37,41]               0 4277  4514
    7       (41,46]               0 4904  5169
    8       (46,51]               0 4582  4812
    9       (51,57]               0 4039  4236
    10      (57,76]               0 3926  4031
    11       (1,23]               1  194  5912
    12      (23,26]               1  330  5579
    13      (26,28]               1  209  3314
    14      (28,33]               1  416  6693
    15      (33,37]               1  239  4682
    16      (37,41]               1  237  4514
    17      (41,46]               1  265  5169
    18      (46,51]               1  230  4812
    19      (51,57]               1  197  4236
    20      (57,76]               1  105  4031
As you might have noticed, the age intervals start repeating at row 11. I need only 10 rows, with the 0's and 1's in different columns, like this:
       ageintervals   1    0 total
    1        (1,23] 194 5718  5912
    2       (23,26] 330 5249  5579
    3       (26,28] 209 3105  3314
    4       (28,33] 416 6277  6693
    5       (33,37] 239 4443  4682
    6       (37,41] 237 4277  4514
    7       (41,46] 265 4904  5169
    8       (46,51] 230 4582  4812
    9       (51,57] 197 4039  4236
    10      (57,76] 105 3926  4031

Many thanks.
This is a straightforward "long" to "wide" transformation that is easy to achieve with reshape in base R:
    # mydf is assumed to hold the data frame shown in the question
    reshape(mydf, idvar = c("ageintervals", "total"),
            timevar = "mytest.g_b_flag", direction = "wide")
    #    ageintervals total freq.0 freq.1
    # 1        (1,23]  5912   5718    194
    # 2       (23,26]  5579   5249    330
    # 3       (26,28]  3314   3105    209
    # 4       (28,33]  6693   6277    416
    # 5       (33,37]  4682   4443    239
    # 6       (37,41]  4514   4277    237
    # 7       (41,46]  5169   4904    265
    # 8       (46,51]  4812   4582    230
    # 9       (51,57]  4236   4039    197
    # 10      (57,76]  4031   3926    105
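To try this end to end, here is a minimal sketch that rebuilds the question's data by hand and then reorders and renames the reshaped columns to match the requested layout. The object names `ages` and `wide` are only illustrative, and the numeric `0`/`1` flag is an assumption (with `table()` output it would be a factor, which `reshape` handles the same way).

```r
# Rebuild the long-format data from the question (values copied from the post).
ages <- c("(1,23]", "(23,26]", "(26,28]", "(28,33]", "(33,37]",
          "(37,41]", "(41,46]", "(46,51]", "(51,57]", "(57,76]")
mydf <- data.frame(
  ageintervals    = factor(rep(ages, 2), levels = ages),
  mytest.g_b_flag = rep(c(0, 1), each = 10),
  freq  = c(5718, 5249, 3105, 6277, 4443, 4277, 4904, 4582, 4039, 3926,
            194, 330, 209, 416, 239, 237, 265, 230, 197, 105),
  total = rep(c(5912, 5579, 3314, 6693, 4682, 4514, 5169, 4812, 4236, 4031), 2)
)

# Long to wide: one row per (ageintervals, total), one freq column per flag.
wide <- reshape(mydf, idvar = c("ageintervals", "total"),
                timevar = "mytest.g_b_flag", direction = "wide")

# Reorder and rename so the columns read: ageintervals, 1, 0, total.
wide <- wide[, c("ageintervals", "freq.1", "freq.0", "total")]
names(wide) <- c("ageintervals", "1", "0", "total")
wide
```

Because the new column names are not syntactic R names, they have to be accessed as `wide[["1"]]` or with backticks rather than `wide$1`.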
Other alternatives include:
reshape2
    library(reshape2)
    dcast(mydf, ... ~ mytest.g_b_flag, value.var = 'freq')
tidyr
    library(tidyr)
    spread(mydf, mytest.g_b_flag, freq)
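If you would rather not load a package, the already-aggregated counts can also be cross-tabulated in base R with `xtabs`. This is a sketch, not part of the original answer: it uses only the first three age intervals from the question to keep it short, and the names `ages`, `tab` and `wide` are illustrative.

```r
# First three age intervals from the question, in long format.
ages <- c("(1,23]", "(23,26]", "(26,28]")
mydf <- data.frame(
  ageintervals    = factor(rep(ages, 2), levels = ages),
  mytest.g_b_flag = rep(c(0, 1), each = 3),
  freq            = c(5718, 5249, 3105, 194, 330, 209)
)

# Sum freq for each combination of age interval (rows) and flag (columns).
tab <- xtabs(freq ~ ageintervals + mytest.g_b_flag, data = mydf)

# Append the row sums as a third column, then convert to a data frame.
wide <- as.data.frame.matrix(addmargins(tab, 2, sum))
wide
```

Here the age intervals end up as row names rather than as a column, and the total is recomputed from the counts instead of being carried over from the input.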
Update
This problem is possibly avoidable in the first place.
Run the following example code and compare the output at each stage:
    ## create sample data
    set.seed(1)
    dat <- data.frame(v1 = sample(letters[1:3], 20, TRUE),
                      v2 = sample(c(0, 1), 20, TRUE))

    ## view the output
    dat

    ## this is what happens when you use `data.frame` on a `table`
    data.frame(table(dat))

    ## compare it with `as.data.frame.matrix`
    as.data.frame.matrix(table(dat))

    ## the total can be added automatically with `addmargins`
    as.data.frame.matrix(addmargins(table(dat), 2, sum))