I have a large data set (800K rows), so I’m looking for 1) any solution to my problem, but 2) an efficient one would be better.
See the attached file …
I have columns A and B filled with binary data (ones and zeros). The ones and zeros come in bunches. The two columns are not correlated.
I want to fill column C from column A. If the cell in A is zero, set the cell in C to be zero (no problem). If the cell in A is a 1, start a cumulative sum in column C (so 0,1,1,1,0) in A converts to (0,1,2,3,0) in C. Again, not a problem.
Here’s the problem: I want to mask some of column C and put the result in column D. If there’s any cell in column B that’s a 1, and there’s a cumulative sum next to it in column C, then I want the entire set of cells in the cumulative sum to be reset to 0 in column D, including the rows before and after the one where there was a 1 in column B. In the example above, a (1,0,0,0,0) or (0,0,0,0,1) in B does nothing to C, but a (0,1,0,0,0), or a (0,0,1,0,0) or a ((0,0,0,1,0) would convert D to a (0,0,0,0,0). So do other combinations with more than one 1 in any of those three spots.
For me, the roadblock is that I never know how many cells in D — before the 1, or after the 1, shows up in B — need to be reset to 0.