Excel Question

propanol · January 10, 2008, 9:46pm

Don’t know the exact question, but it does sound like a very simple integer programming problem, and MFE has provided the solution. One possible pitfall is that the answer may not be unique, ie there exists more than 1 combination of 01s that will give the sum. There are only 2^16 combinations, so you can check it deterministically in Excel, which provides 2^16 rows. Personally, I would simply write a simple vba that runs 5000 simulations (rnd number <0.5 => 0 or 1 otherwise) and record the 01 combination when its sumproduct equals the regional value. That should give you most, if not all, the possible 01s sequences.

jmerten · January 10, 2008, 10:35pm

MFE, that worked, pretty clever man… thanks.

JoeyDVivre · January 10, 2008, 10:59pm

I don’t know about those simple integer programming problems. They always seem pretty tough to me. I don’t see lots of the stuff mentioned above, but maybe its because I don’t know what the question is exactly. Let’s say that the question is the following: Each state has some X(i) = loan write-offs in state i There are some known number of regions, say 8, and each region has a regional write-off R(j) = sum(x(i)) where i in some set A(j). The goal is to determine the sets A(j). I agree with propanol that there is no reason to think there is a unique solution but if the data are sufficiently granular there should be just one (imagine write-offs to the penny and this is a really easy problem). This isn’t the knapsack problem because there is no maximization and there is lots more structure. Nearly ditto for integer programming, although you might be able to recast it that way. The first way I would go about solving this problem is to recognize that geography almost certainly matter. CA and CT are not in the same region. So start with the assumption that each state in a region shares a border with another state in the region. Now you have a pretty easy algorithm that is really easy if you have sufficiently granular data. Start with some corner state, say FL and then do a recursive search (ala, knight’s tour) until you have a match with one region. This shouldn’t take long at all because the border assumption limits your choices a ton (we’re talking seconds and 8 lines of code). Then pick a corner state adjacent to the region you just did and do it again. Wash, rinse, repeat. Now if the data is not very granular, you could wind up with a problem and you would have to recurse on regions. You could also do this problem without the border assumption using, say, a genetic algoithm which would almost surely smack it down. Just because a problem is NP-Hard or NP-complete doesn’t mean that you can’t have a really high probability of solving it in some polynomial time (although I would like to see someone calculate the probability a genetic algorithm solves a particular NP-hard problem in k generations, say). BTW - I think the best corner state to start on is Maine, because you are nicely going to pick off New england then maybe or maybe not NY.

alphabound · January 11, 2008, 12:25am

For the OP’s problem, I think MFE’s solution may be sufficient. I think the misnomer is the assumption that we can produce the required unique combination using simple programming techniques. The gamut of real life scenarios to which this methodology can be applied contains larger data sets that will return multiple combinations that will suffice at first glance but may not be exactly what you want. I stick by my initial assertion that this problem (fundamentally) is more complex than it seems. If anyone is interested I can post up some real world data tomorrow that will speak to this point.

JoeyDVivre · January 11, 2008, 12:30am

I wonder how solver did that… It’s a pretty amazing piece of software.

alphabound · January 11, 2008, 12:50am

I think I may need to clarify myself. I’m not debating that a solution can be found using solver, goal seek, or any other off the shelf product. I’m also not looking to go tit-for-tat with someone who I know has a better grasp of the underlying numerical methods than I do. I’m just stating that in the real world there can be multiple solutions that meet the criteria on the surface. I know this first hand. Again, I am speaking to the fundamental issue, not the OP’s specific question. I’d be happy to send you the code I tried to implement (the same I sent the OP) along with an example or two of why I gave up trying to automate the resolution of a similar problem. Your input would be welcomed since I couldn’t solve this on my own.

bchad · January 11, 2008, 1:05am

Maybe I don’t understand the problem, but if there is more than one solution to the problem (as I suspect there is), how do we keep MFE’s Solver from converging on some trivial solution?

alphabound · January 11, 2008, 1:19am

That’s the downside of off the shelf products as well as programmed solutions in general (in this case). I suspect it stems from this being a variation of the knapsack problem. Like Joey said, there is no maximization (or minimization) involved so a computer cannot identify the ideal solution. The limitations lie in the program’s ability to merely produce a set of possible combinations which satisfy the objective constraint (sum of combination = x). EDIT: in this case it may be fine, but in the real world there are pitfalls.

MFE · January 11, 2008, 2:38am

bchadwick Wrote: ------------------------------------------------------- > Maybe I don’t understand the problem, but if there > is more than one solution to the problem (as I > suspect there is), how do we keep MFE’s Solver > from converging on some trivial solution? you can’t. but I think you can continue to run the solver to see if another solution exists. the other question would jmerten be able to identify if the solution solver gave him was the correct one? It sounds to me like 2 variable-one equation dilemma.

jmerten · January 11, 2008, 11:43am

For my problem, the solution was unique, meaning that there wasn’t any other possibilities. That is why Solver was able to find the answer by some sort of decision tree type of process. Like MFE said, it took a while but it did find the correct combination. For more intense problems where multiple combinations would yield a correct solution you might have to use Alpha’s script where more “logic” exists and it would find multiple solutions.

JoeyDVivre · January 11, 2008, 12:54pm

The other thing that I believe is that you really can use genetic algorithms to solve all of these kinds of problems…

JoeyDVivre · January 11, 2008, 12:54pm

The other thing that I believe is that you really can use genetic algorithms to solve all of these kinds of problems…

JoeyDVivre · January 11, 2008, 12:54pm

The other thing that I believe is that you really can use genetic algorithms to solve all of these kinds of problems…

DarienHacker · January 11, 2008, 2:44pm

Most genetic algorithms suffer from lack of guarantees on speed of convergence (in fact most don’t offer convergence guarantees at all). They also offer no guarantees of optimality, can get stuck in local minima, etc. For mission-critical applications they’re generally the wrong approach. See http://en.wikipedia.org/wiki/Genetic_algorithm#Observations The usual approach to combinatorial problems is heuristic state-space search, which can be implemented in a way that guarantees optimality and convergence. (See e.g. http://en.wikipedia.org/wiki/Heuristic_function for an intro.) In jmerten’s case I suspect that the specific problem was small enough that Solver was able to luck into a solution fairly quickly. With larger problems there could be serious scaling issues.

JoeyDVivre · January 11, 2008, 3:26pm

Poof. You and everything you see are the result of a genetic algorithm. If it doesn’t work, it’s because the representation and operators weren’t right. Edit: My solution above for this problem is noit genetic and guaranteed to work, probably in a small amount of time. The problem is that the data set-up is a pain.

DarienHacker · January 11, 2008, 3:50pm

Genetic algorithm:state-space search :: neural nets:factor analysis Everything you think comes from a neural net – so there’s no need for statistics.

JoeyDVivre · January 11, 2008, 4:19pm

I guess I don’t get the analogy. Neural nets are a good solution to some problems - not as powerful as statistical methods when models and assumptions are true, but more flexible. Factor analysis is a moderately useful statistical technique used by anybody comfortable with fitting more parameters than they have data points. I can’t think of any problem where I would decide among neural nets, genetic algorithms, and factor analysis.

DarienHacker · January 11, 2008, 4:28pm

I’m saying that both neural nets and genetic algorithms are cast as alternatives to more traditional, if less flashy, techniques. Their performance and behavior are often poorer or not understood at all. In the neural net world I can’t remember the last time I saw an application that wasn’t better served by a straightforward statistical analysis. I’d say similar for genetic algorithms.

TheBigBean · January 11, 2008, 5:31pm

alphabound Wrote: > > A table is two dimensional. This problem has > sixteen potential dimensions. I agree with your > theory but I doubt there is a practical > application. The other solutions offered are better. However, the above is not strictly true. Say for example you have 10 banana prices and 10 orange prices and have a nice model that predicts the consequential price of gold with variations in the underlying fruit. Your table only needs 100 rows, numbered from 0-99. The first figure feeds banana prices into the model and the second oranges. In this case he would need a 16 digit binary number which I think you can generate with =DEC2BIN(ROW(),16) and copying down would give you all possibilities - I think, if there were enough rows. This would solve problems regarding unique solutions as it is exhaustive, but I don’t think there are enough rows, so the other solutions are better.

TheBigBean · January 11, 2008, 5:33pm

^ Actually a macro from 1 to 16! reporting correct results using the above would be pretty simple and would work I think.