1 May 2012

Quick Tip: Replace Values in Dataframe on Condition with Random Numbers

This one took me some time, though it is in fact quite simple: index the data frame with a logical matrix and assign one random number per matching cell.
> options(scipen=999)
> (my_df <- data.frame(matrix(sample(c(0,1), 100, replace = TRUE), 10, 10)))
   X1 X2 X3 X4 X5 X6 X7 X8 X9 X10
1   0  0  1  0  1  1  1  1  0   1
2   0  0  1  1  1  0  0  0  0   0
3   0  1  1  0  0  1  1  0  0   1
4   1  1  0  0  0  0  0  0  0   0
5   0  0  1  1  1  0  0  1  1   1
6   0  0  1  1  0  0  1  1  0   1
7   0  0  1  1  1  0  1  1  1   1
8   1  1  1  0  0  0  0  1  1   0
9   0  0  0  0  1  0  1  0  1   0
10  1  1  1  1  1  0  1  1  0   1
> my_df[my_df == 0] <- runif(sum(my_df==0), 0, 0.001)
> my_df
         X1       X2       X3       X4       X5       X6       X7       X8
1  0.000268 0.000926 1.000000 2.00e-05 1.000000 1.00e+00 1.00e+00 1.00e+00
2  0.000531 0.000882 1.000000 1.00e+00 1.000000 4.66e-04 3.96e-04 6.70e-04
3  0.000785 1.000000 1.000000 5.03e-04 0.000164 1.00e+00 1.00e+00 2.98e-04
4  1.000000 1.000000 0.000336 8.71e-04 0.000770 7.44e-05 6.49e-05 1.01e-04
5  0.000168 0.000674 1.000000 1.00e+00 1.000000 6.49e-04 2.26e-04 1.00e+00
6  0.000404 0.000950 1.000000 1.00e+00 0.000735 7.59e-04 1.00e+00 1.00e+00
7  0.000472 0.000516 1.000000 1.00e+00 1.000000 1.37e-04 1.00e+00 1.00e+00
8  1.000000 1.000000 1.000000 6.30e-06 0.000972 3.97e-04 5.46e-05 1.00e+00
9  0.000868 0.000577 0.000347 7.21e-05 1.000000 2.25e-04 1.00e+00 7.19e-05
10 1.000000 1.000000 1.000000 1.00e+00 1.000000 5.80e-05 1.00e+00 1.00e+00
         X9      X10
1  0.000880 1.00e+00
2  0.000754 7.99e-04
3  0.000817 1.00e+00
4  0.000982 7.85e-04
5  1.000000 1.00e+00
6  0.000104 1.00e+00
7  1.000000 1.00e+00
8  1.000000 9.43e-06
9  1.000000 7.79e-04
10 0.000099 1.00e+00
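
The same idea can be wrapped in a small function. This is only a sketch: the name `replace_with_runif` and its arguments are my own invention for illustration, and the `set.seed()` call is there solely to make the example repeatable. The point is that `runif()` is asked for exactly as many numbers as there are matching cells, so every replaced cell gets its own draw.

```r
# Hypothetical helper (not a base R function): replace every cell equal to
# `value` with an independent uniform draw between `lower` and `upper`.
replace_with_runif <- function(df, value = 0, lower = 0, upper = 0.001) {
  hit <- df == value            # logical matrix with the same shape as df
  hit[is.na(hit)] <- FALSE      # treat NA cells as non-matches
  # One random number per matching cell
  df[hit] <- runif(sum(hit), lower, upper)
  df
}

set.seed(42)  # assumption: fixed seed, only for reproducibility
my_df <- data.frame(matrix(sample(c(0, 1), 100, replace = TRUE), 10, 10))
new_df <- replace_with_runif(my_df)
```

Passing a single number instead, e.g. `my_df[my_df == 0] <- runif(1, 0, 0.001)`, would recycle that one value into every matching cell, which is probably not what you want here.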
