I have written a script that has to delete the rows in which more than 20% of the cells are smaller than 10. It works great on small data sets, but on bigger ones it is uselessly slow. Please help me.
Here's my script:
dataset <- choose.files()
dataset <- read.delim(dataset, header = TRUE, sep = "\t", blank.lines.skip = TRUE)
for (i in 1:length(dataset[, 1])) {
  count <- 0
  for (j in 1:length(dataset[i, ])) {
    if (dataset[i, j] < 10 || is.na(dataset[i, j])) {
      count = count + 1
    }
  }
  if (count > 0.2 * length(dataset[i, ])) {
    dataset = dataset[-i, ]
    delete <- delete + 1
  }
}
This should solve your problem. It uses apply over the rows instead of nested loops.
dat <- data.frame(matrix(rnorm(100, 10, 1), 10))
bad <- apply(dat, 1, function(x) {
  return((sum(x < 10, na.rm = TRUE) + sum(is.na(x))) > length(x) * 0.2)
})
dat <- dat[!bad, ]
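If apply is still too slow on a very large table, a fully vectorized variant may help. The following is only a sketch of the same 20% rule, not the answer's method: it builds one logical matrix and counts per row with rowSums. The example data (seed, mean 12, the inserted NA) and the names small and kept are assumptions made for illustration, and it assumes every column is numeric.

```r
# Hypothetical example data: 10 rows x 10 columns with one NA (not from the post)
set.seed(1)
dat <- data.frame(matrix(rnorm(100, 12, 2), 10))
dat[2, 3] <- NA

# TRUE where a cell is missing or below 10 (TRUE | NA evaluates to TRUE)
small <- is.na(dat) | dat < 10

# Flag rows in which more than 20% of the cells are small or missing
bad <- rowSums(small) > 0.2 * ncol(dat)

# Keep the remaining rows; drop = FALSE keeps a data frame even with one column
kept <- dat[!bad, , drop = FALSE]
```

For data read from a file, the same lines should work with dataset in place of dat, provided the non-numeric columns are set aside first.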