Using Decision Trees

A decision tree provides a rational method for choice in the presence of uncertainty.

 

Making Intelligent Bets

Decision trees provide a rational method for making choices when uncertainty about the outcomes can be quantified.  For example, suppose someone offers us a raffle ticket, at a cost of $1, for a chance to win a car that is valued at $100,000.  A prudent person purchases the proffered paper posterior to predicting a positive payback.  Should you buy the ticket?  Here’s how to find out!

We need to assign values to our variables.  We assign a value of $1 to the cost, represented by c; if we pay nothing, we have chosen the “DO NOT” branch.  There is no “TRY” in this type of analysis.  The nerdy among you might catch the sci-fi reference.  The true geeks caught the reference in the sentence pointing out the more obvious reference two sentences back.  But, I digress, recursively . . .

The car value of $100,000 must be adjusted, since you know that a certain bankrupt uncle will force you to sell it in order to pay the tax bill.  By the time the state and city get their shares, a paltry $50,000 remains.  To most people that represents a lot of money, and a lot of people buy the ticket anyway without performing a formal analysis.  We keep our wallets in our pockets, and instead assign $50,000 to v.

Finally, you need to know the probability of winning.  Assuming a fair game and a one-ticket purchase, the probability of winning becomes the reciprocal of the number of tickets in the hopper at the time of the drawing.  Symbolically, p = 1/N.

Unfortunately, we cannot know N unless we obtain a count of the tickets in the hopper immediately before the drawing occurs.  How do we overcome this information deficit?  Estimate it.

  • If we can find the number of tickets currently sold, divide this number by the amount of time that has elapsed, then multiply the result by the total length of time that tickets are being sold.  This method extrapolates the average rate of sales to the close of the sale period.
  • If the history of previous raffles indicates an almost certain chance of selling all tickets, then just find out how many tickets exist.
  • If the history of previous raffles indicates that a fixed number of tickets sell every year, use that number.  Various types of regression techniques could be used to extrapolate trends from previous years.
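
The first estimate above is a simple linear extrapolation, and a minimal sketch of it follows.  The function name and the sample figures (2000 tickets sold after 10 of 25 sale days) are made up for illustration; they are not data from the raffle.

```python
def estimate_final_ticket_count(sold_so_far, days_elapsed, total_sale_days):
    """Extrapolate the average sales rate to the close of the sale period."""
    if days_elapsed <= 0:
        raise ValueError("days_elapsed must be positive")
    average_rate = sold_so_far / days_elapsed   # tickets per day so far
    return average_rate * total_sale_days       # projected total at the drawing


# Hypothetical numbers, chosen only to illustrate the arithmetic.
n_estimate = estimate_final_ticket_count(2000, 10, 25)
print(f"estimated N = {n_estimate:.0f}, p = 1/N = {1 / n_estimate:.4f}")
```

The estimate is only as good as the assumption that sales continue at their average rate, which is why the other two bullets lean on the history of previous raffles instead.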

For the sake of argument, let us pretend that a generous dealer donated the car (also for tax purposes) to a poor rural community school, which raffles it off to buy iPads and install wireless networks on its campus.  Due to the rich prize, people came from miles around to buy, so our estimated ticket count is 5000.  This means that p = 1/5000 = 0.0002.

Should we buy?  Our expected value is the sum of the outcomes, each multiplied by its probability.  We know that probabilities sum to one, so we have E = p(v - c) - (1 - p)c = pv - c = 0.0002*50000 - 1 = $9.  Our expected payoff is $9, so we gladly buy a ticket.
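
For readers who prefer to see the two branches of the tree spelled out, here is a small sketch of the same calculation.  The function and argument names are my own; the numbers are the ones used in this post.

```python
def expected_payoff(p_win, prize_value, ticket_cost):
    """Expected value of the BUY branch of a two-outcome decision tree."""
    win_branch = p_win * (prize_value - ticket_cost)   # we win and net v - c
    lose_branch = (1.0 - p_win) * (-ticket_cost)       # we lose and are out c
    return win_branch + lose_branch


payoff = expected_payoff(p_win=1 / 5000, prize_value=50_000, ticket_cost=1)
print(f"expected payoff: ${payoff:.2f}")
```

The DO NOT branch always pays zero, so buying is the rational choice whenever this number is positive.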

Epilogue

You didn’t win, but you still gained more than $1 worth of warm fuzzies for helping the youth of your community compete in a globally connected world.  Now, fast-forward one year.  The new technology enabled the director of fundraising to learn that she needs to raise the price of tickets to maximize the profits.  The tax laws did not change, so next year the same dealer donates another car of equal value to the school.  What price should the director set for the tickets?

Naturally, the cost should lead to a neutral expected payout.  People buy raffle tickets for the warm fuzzies anyway, right?  With E=0, we rearrange the equation for expected value to pv = c.  Thus, the cost threshold for a rational purchase can be computed by multiplying the prize value by the probability of winning.  Our director assumes that she can sell 5000 tickets again, and sets the raffle ticket price at 0.0002*50000=$10.
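
The director's pricing rule fits in a couple of lines.  This is only a sketch under the same assumptions as above (a fair drawing and p = 1/N); the function name is hypothetical.

```python
def break_even_price(prize_value, tickets_sold):
    """Ticket price c at which E = p*v - c is exactly zero, with p = 1/N."""
    p_win = 1.0 / tickets_sold
    return p_win * prize_value


price = break_even_price(prize_value=50_000, tickets_sold=5_000)
print(f"break-even ticket price: ${price:.2f}")
```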

As a matter of insurance, the director only prints 5000 tickets and announces this fact.  Neglecting inflation, the expected payout will never go negative, and thus any rational person of legal age with $10 to invest and no religious restrictions on gambling should buy a ticket.  How can this be?  This final exercise is left for the readers!

Tschüß!

 

Posted in Model Thinking

Exploring Granovetter’s Model

Granovetter’s model for collective action uses thresholds to determine whether agents in a population “join the movement.”  The number of an agent’s peers who must participate in the action in question before said agent also begins to participate defines the thresholds.  The model was published in the following academic paper: Threshold Models of Collective Behavior, Granovetter, M.  The American Journal of Sociology, Vol. 83, No. 6 (May, 1978), 1420-1443.

What follows represents a simple example of the model’s operation, as presented in the paper cited above.  Let’s define the collective action as a fad: listening to dubstep music.  I won’t listen to dubstep until ten of my peers are listening to dubstep.  My “threshold” is therefore defined to be ten.  Furthermore, nine of my peers won’t listen to dubstep unless one of us also listens to dubstep.  Their thresholds are all one.  As a result, nobody listens to dubstep.  Now, imagine that a new person joins our peer group, and that person likes dubstep.  This first person’s threshold was zero.  Seeing this development, my nine impressionable peers begin listening to dubstep.  Now that ten of my peers are listening to dubstep, I download some dubstep and discover what I’m missing.  This example, while contrived, represents the type of group dynamics Granovetter’s model seeks to describe.  It can describe fads, riots and other collective actions.  As a matter of terminology, people with zero thresholds are called “instigators.”
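
The dubstep example is small enough to simulate directly.  The sketch below is my own illustration of the threshold rule, not code from Granovetter's paper: each agent joins once the number of current participants reaches that agent's threshold, and we repeat until nobody else joins.

```python
def cascade(thresholds):
    """Return how many agents participate once the group settles down.

    Each entry is the number of participants an agent must see before
    joining; a threshold of 0 marks an instigator.
    """
    participants = 0
    while True:
        # Everyone whose threshold is met by the current crowd takes part.
        joined = sum(1 for t in thresholds if t <= participants)
        if joined == participants:
            return participants
        participants = joined


peer_group = [1] * 9 + [10]        # nine impressionable peers, plus me
print(cascade(peer_group))         # the group before the newcomer arrives
print(cascade(peer_group + [0]))   # the group after a dubstep fan joins
```

Nothing about the original eleven thresholds changes between the two calls; only the arrival of a single instigator differs.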

Figure A: According to Granovetter's model, the equilibrium number of participants in a riot is given by the leftmost intersection of the cumulative distribution of thresholds (blue line) with the diagonal, shown here in red. (mean=25, sigma=10)

Granovetter sought to demonstrate that unanticipated collective actions can occur without changes in individual preferences.  Furthermore, the dynamics could be highly unstable.  To demonstrate this, he defined a simple crowd of 100 people, with the cumulative distribution of thresholds (CDT) to participate in a riot described by a normal distribution with mean=25 and variable sigma (Figure A).

Figure B: Close-up of equilibrium point for (25,10). Since the distribution intersects the diagonal at 0.769551, no riot occurs for a system described by this distribution.

Once an instigator acts, higher and higher thresholds are activated until reaching an equilibrium when the number of people rioting equals the CDT.  Thus, Figure B demonstrates that for (mean,sigma)==(25,10), no riot occurs because the equilibrium falls below a single rioter.  If something causes the width of the CDT to increase such that at least one person with zero threshold exists, then a riot will ensue.  The threshold for the appearance of an instigator is approximately sigma=10.3166 for a mean of 25.
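
The equilibrium can also be found numerically by starting with zero rioters and repeatedly asking how many people have a threshold at or below the current crowd size.  The sketch below uses only the Python standard library rather than the Mathematica notebook mentioned at the end of this post; the function names, tolerance and iteration cap are my own choices.

```python
import math


def normal_cdf(x, mean, sigma):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sigma * math.sqrt(2.0))))


def equilibrium(mean, sigma, crowd=100, tol=1e-12, max_iter=100_000):
    """Iterate r -> crowd * CDF(r) from r = 0 until it stops changing."""
    rioters = 0.0
    for _ in range(max_iter):
        next_rioters = crowd * normal_cdf(rioters, mean, sigma)
        if abs(next_rioters - rioters) < tol:
            return next_rioters
        rioters = next_rioters
    return rioters  # not converged within max_iter; last value reached


for sigma in (10, 12):
    print(f"sigma={sigma}: equilibrium = {equilibrium(25, sigma):.4f} rioters")
```

Starting from zero means the iteration stops at the leftmost intersection with the diagonal, which is the equilibrium described in Figure A.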

 

Figure C: Increasing sigma to twelve results in an easily kettled riot of 4.02111 protestors.

Figure D: At sigma=12.22208, the CDT just touches the diagonal to give an equilibrium riot size of 6.18741.

Figure C demonstrates the small riot that results from increasing sigma to twelve.  Normally, we dismiss misbehavior by a small number of people in a crowd as reflecting poorly on the individual participants.  Granovetter’s model demonstrates the weakness of this conclusion.  Consider what happens when we increase sigma to 12.22208.  Figure D shows that for this value of sigma, called the “critical point,” the CDT just touches the diagonal, and the riot participation rate increases to 6.18741 rioters.  If sigma were instead just a tiny bit greater, say 12.22209, the leftmost intersection jumps to the far right of the graph and everyone joins in.  For any value of sigma greater than the critical sigma, the CDT does not cross the diagonal until after the mean (Figure E).  Thus, one population has six “bad eggs” and another becomes an angry mob despite possessing nearly identical CDTs.
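
One way to locate the critical point without plotting is to track how far the lower knee of the CDT sits above the diagonal and bisect for the sigma at which that gap closes.  This is a sketch of that idea, not the method used to produce the figures; the bracketing interval of 12 to 12.5 is an assumption taken from the values discussed above.

```python
import math

SQRT_2PI = math.sqrt(2.0 * math.pi)


def standard_normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def gap_at_knee(sigma, mean=25.0, crowd=100.0):
    """Smallest height of the CDT above the diagonal, left of the mean.

    The gap is smallest where the slope of the CDT equals the slope of
    the diagonal, i.e. where crowd * pdf(r) = 1.  Returns (gap, r).
    """
    z = -math.sqrt(-2.0 * math.log(sigma * SQRT_2PI / crowd))
    r = mean + sigma * z
    return crowd * standard_normal_cdf(z) - r, r


def critical_sigma(lo=12.0, hi=12.5, tol=1e-9):
    """Bisect for the sigma at which the CDT just touches the diagonal."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        gap, _ = gap_at_knee(mid)
        if gap < 0:
            lo = mid   # CDT dips below the diagonal: the riot stays small
        else:
            hi = mid   # CDT clears the diagonal: the floodgate is open
    return 0.5 * (lo + hi)


sigma_c = critical_sigma()
_, touch_point = gap_at_knee(sigma_c)
print(f"critical sigma = {sigma_c:.5f}, CDT touches diagonal near r = {touch_point:.3f}")
```

The result should land close to the critical sigma quoted above.  The touching point can differ slightly from the equilibrium riot size in Figure D, because the equilibrium moves very quickly with sigma this close to the critical point.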

Figure E: The lower "knee" of the sigmoidal CDT acts like a floodgate. If it doesn't cross the diagonal before the mean, the situation changes from a bad news segment to a full-blown riot.

 

We can think of the lower “knee” of the sigmoidal CDT as a sort of floodgate.  When this floodgate opens by rising above the diagonal, a full-blown riot ensues.  A plot of riot size vs. sigma is presented as Figure F.  These two graphs (E+F) present Granovetter’s key results:  the switch from minor disturbance to full-blown riot is quite sensitive to the distribution width, and occurs far below the mean riot threshold of the group.

Figure F: Riot size as a function of standard deviation of the CDT for a mean threshold of 25.

Oddly enough, further increases of sigma cause the number of rioters to go down.  This results from the same phenomenon that caused the sudden jump; specifically, the symmetrical knees of the normal CDF (Figure G).  In this figure, the green trace represents the case where everyone in the crowd possesses approximately the same propensity to riot, and no riot occurs unless 25 or more people spontaneously riot.  The odds of this many instigators occurring for such a narrow distribution are infinitesimally low, so no riot is predicted.  The blue trace shows the critical case, where there are enough agents activated to cause a disturbance, but cool heads still prevail.  The orange line demonstrates how the riot size decreases as the width of the CDT is increased.

Figure G: Comparisons of various sigma values demonstrating three key results from Granovetter's normal CDT. (Key: green, sigma=1; blue, sigma=12.22; orange, sigma=50)

The statistically erudite reader may balk at the range mismatch between the normal distribution on infinite limits and our discrete basis set.  We’ll leave that observation as barroom argument fodder for pedants, and instead accept it for the moment and move on to the more important question:  What does all this mean?  For starters, it means that we must consider the distribution of propensities for collective action independently of how they are formed.  Granovetter presents cases where random fluctuations in the sample from the same distribution result in no riot, minor disturbance, and full-blown riot.  The “take-home message” states that the observation of a riot does not necessarily indicate that the participants are more riot-prone than the population at large.  It merely means that the distribution of thresholds was such that the actions of relatively few instigators started an unfortunate chain reaction in a metastable dynamical system.

Certainly, how one attains a certain threshold stands as an important question.  One could argue that understanding how thresholds are formed holds great promise as a method for shaping the distribution of thresholds, as the width of the distribution seems to be of greater import than the mean in the presence of instigators.  Nevertheless, Granovetter’s work shows that individual attitude formation and collective action participation should be treated as distinct, yet related, phenomena.

I will close this post by mentioning that I used Mathematica 8 Home Edition to solve the difference equations and construct the plots.  An archive of the notebook can be downloaded here.

Posted in Model Thinking

I’m ready for some football!

I like American football.  Not enough to make a habit out of watching televised games.  Certainly not enough to warrant exposing myself to sub-freezing temperatures wearing nothing but paint from the waist up.  Nevertheless, I appreciate the strategy, teamwork and physical prowess required by the game.  I rejoiced when, two years ago, the New Orleans Saints finally lost the derogatory nickname “Ain’t’s” against an equally historically under-performing team, the Minnesota Vikings.  That win provided the counterexample to this admittedly stupid theory:

It seems that every time I move to a new town with an NFL franchise, that team makes it to the Super Bowl soon afterwards.  For example, I moved to Tampa in 1999.  That year, the Buccaneers won the NFC Central Division Championship.  In 2002, they won the Super Bowl.  In the nine months that I lived in Philadelphia during 2006-2007, the Eagles nearly made it to the Super Bowl.  Of course, their record doesn’t compare to the Bucs’ horrendous history, and I moved to Boston in the spring of 2007.  Surprisingly enough, the Patriots played in the Super Bowl in 2008.

My wife and I still live in the Greater Boston Metropolitan area.  The Patriots kick off in 90 minutes or so.  Synchronicity?

Nah… the Saints didn’t win until after I left home.  So much for that theory.

I’ll just keep watching the Super Bowl and Conference Championships.  Divination should be done by experts.

Posted in Philosophy

2011 Reading List

At the beginning of 2011, I resolved to watch less television and read more books.  As the year draws to a close, I think I kept my resolution.  Please allow me to present my reading list. Evaluations are indicated as one to five stars following the title, and represent little more than my enjoyment of the work.

Books I started reading in 2010 and finished this year:

Books I started and finished in 2011:

Currently reading:

With a few exceptions, contemporary television programming erodes our culture.  The Kindle is handy, but I’ve grown to appreciate printed books.  I plan to carry over last year’s resolution into 2012.

Posted in Philosophy

Moore’s Law — still in force

Recently, I purchased a new 13″ MacBook Pro with a 2.8 GHz Core i7 processor and 4 GB DDR3 at 1-1/3 GHz.  This machine amazes me.  Two cores with four virtual processors.  It beats the second-generation Mac Pro I use at work by nearly 27% according to the spot-check I did with a Cython-accelerated audio codec unit test.  I suspect that a real benchmark will widen the gap.

Apple made things difficult for numerical applications by removing gcc AND gfortran from their latest XCode Tools (4.2).  To make matters worse, the new architecture breaks a lot of configure scripts.  The Core i7 is so new, in fact, that some bleeding-edge repository sources set -faltivec.  What a blast from the past!  Nevertheless, after several long weekends and three erase+re-install cycles, I have Python 2.7.2, iPython, MatPlotLib, SciPy, Cython and scikits.audiolab configured.  Some unit tests fail, but they appear to be harmless.  Writing unit tests for key functionalities of third-party libraries minimizes the risk.  I generated copious notes during the process, so I’ll post a how-to or two (or more) once I’ve had time to distill them.

The state of virtualization technology caught me by surprise.  After wrestling with the languorous Android VM on my Core 2 Duo MacBook,  VirtualBox was a breath of fresh air.  Full screen Ubuntu 11.10 on my new laptop seems faster than Ubuntu 10.04 LTS running natively on a first-generation MacBook.  Hopefully, time will permit some experimentation with computational benchmarks.

Posted in Technology

Euthyphro

Euthyphro finds Socrates outside the court after Socrates has been indicted by Meletus for corrupting the youth of Athens.  After a brief discussion of the charges, Socrates questions Euthyphro about his own case.  The conversation leads Euthyphro into a discussion of the definition of piety.  Here we see a glimpse of the real impetus behind Socrates’s prosecution.

The subject of ethics warrants the attention of scholars and philosophers to this day, but my thesis does not concern the subject matter itself.  My focus, rather, falls on the manner in which Socrates dissects Euthyphro’s arguments, leading him around to contradiction then back to his original position.  All the while, Socrates’s mocking tone grinds upon the nerves.

Ethics are a complex and uncertain topic, for certain.  Nevertheless, few people like being bested, and even fewer take kindly to deliberate attempts to prove them wrong in a casual conversation.  I seriously doubt that ancient Athenians differed from us in this regard.

While a cadre of loyal admirers were inclined to seek out his tutelage, I suspect that more than a few powerful figures wished to be rid of him.  Socrates himself admitted this in his defense (Apology).  Despite a clear and compelling argument for acquittal, the jury handed down a verdict of guilty:  Socrates had to go.

Posted in Philosophy

RMI Detour

I’ve been working through the Head First Design Patterns book, and so far it’s been great to follow along by typing up the memorably silly examples and running them.  Even the best textbooks contain errors and omissions, however, and this book has a few.  I ran into a particularly tough snag in chapter eleven, and decided to post the solution since Google found nothing resembling a recipe.

Chapter eleven, The Proxy Pattern, uses the Java Remote Method Invocation, or RMI, package to implement a “Remote Proxy” for a gumball machine.  The RMI Detour section presents a simple “Hello World” client-server application to bring those of us unfamiliar with RMI up-to-speed.  The RMI compiler uses the server class file to implement the Remote interface “stub” and “skeleton” classes.  The snag is that it does not work as presented in the book.

I found the solution in another O’Reilly book, Learning Java.  The solution requires that both the client and the server install security managers into their respective JVMs.  I inserted the following code into main() for both the client and the server classes:

After using ‘rmic’ to generate the stub from the MyRemoteImpl class and starting the RMI registry using ‘rmiregistry’, attempting to bind the service caused a java.security.AccessControlException to be thrown:

Command line input and output for failure to start service.

We also need a security policy file to give the classes permission to talk to objects in another Java virtual machine (JVM).  There is a ‘policytool’ application that ships with the JDK, but in the interest of simplicity I just created a policy file that granted an all-access pass to its bearer.

Output of 'cat mysecurity.policy'
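
The screenshot itself is not reproduced here.  For reference, an all-permissions policy file of the kind described above normally looks like the following.  This is the standard Java policy syntax, offered as a sketch rather than a copy of the original file, and it is far too permissive for anything beyond a local experiment.

```
grant {
    permission java.security.AllPermission;
};
```

A JVM is usually pointed at such a file with the `-Djava.security.policy=mysecurity.policy` option on the `java` command line.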

This policy file must be declared to each JVM on the command line.  With the new pieces in place, I recompiled the client and server classes, re-ran the RMI compiler, started the server, and re-ran the client program a few times.

Screenshot showing command line and output.

Coolness!  The server received and logged the command, and the stub returned the logged string to the client as if it had directly called sayHello() on the server itself.  Everything works.  Now, back to checking gumball inventories at multiple sites from the comfort of my office.

Posted in Technology

Philosophy Book Club

We’ll read some of the works of Plato, Descartes and Spinoza, and meet regularly for discussion.  For starters, I recommend the following introductory booklets:

Annas, Julia.  Plato:  A Very Short Introduction.  ISBN 978-0-19-280216-3

Sorell, Tom.  Descartes:  A Very Short Introduction.  ISBN 978-0-19-285409-4

Scruton, Roger. Spinoza:  A Very Short Introduction.  ISBN 978-0-19-280316-0

I have finished the Plato and Descartes booklets along with the first four chapters of the Spinoza booklet (fascinating, btw).  I intend to follow the booklets with Hackett’s Plato:  Five Dialogues, Second Edition (ISBN 0-87220-633-5) and the Penguin Classics reissued second edition of The Republic (ISBN-13 978-0-140-44914-3).  I also want to read Plato’s Theaetetus and Timaeus, but I have not yet decided on the specific translations.  I may take the easy route and purchase Hackett’s Plato: Complete Works.  I’ll suggest the Descartes and Spinoza readings when I finish their respective short introduction booklets.

Expect challenging and deep contemplation of ideas that exert profound effects on western culture to this very day.  I want to focus on epistemological and cosmological topics, but welcome anything of interest that the readings inspire.

Still here?  Good.  You know how to reach me :)

Posted in Philosophy

Nutritious Daiquiri

When I get home late and hungry, but don’t want a heavy meal close to bedtime, this liquid dinner comes to my rescue.

Ingredients:

1 banana

6-8 very cold frozen strawberries

8 oz cold water

2 scoops Designer Whey vanilla-flavored powder

3 teaspoons orange-flavored Metamucil

Equipment:

Blender with ice-crush mode.

Procedure:

Measure the water into the blender.  Peel the banana, break it into a few pieces and put it in the blender.  Put the strawberries into the blender.  Put the lid on and hold it down.  Use the pulse/ice crush mode until the blender can run at a low speed without splashing.  With the blender running at a low or moderate speed, add the powdered ingredients and blend until the streaking disappears.  Pour into a glass and enjoy.

Makes 1-4 servings, depending upon how hungry you are.  I tend to drink the whole thing.  I haven’t tried adding alcohol, but I reckon that 2-4 shots of rum will not spoil the flavor.

Food hacking is fun!

Posted in Recipes

Piña Colada Popsicles

By popular demand, my soon-to-be obscure piña colada recipe:

Ingredients:

15 oz can of Coco Lopez cream-of-coconut

15 oz not-from-concentrate pineapple juice

15 oz of dark rum (not spiced)

1.5 quart tub of Breyer’s all-natural vanilla ice cream

Equipment:

Blender, two one-gallon containers, spoon, popsicle molds and sticks, designated driver.

Procedure:

Open can of Coco Lopez, persuade the goop to get into the blender with a spoon.  Fill the can with pineapple juice, stir to dissolve the goop left on the sides of the can, then pour into the blender.  Fill the can with rum, stir again, then pour into the blender.  Run the blender until the liquid looks homogeneous.

Unless you have a really large blender, you need to pour two-thirds of the mixture into one of the containers.  With one pint of the mixture in the blender, add one pint of ice cream and blend until smooth, then pour into the second container.  Repeat until you run out of ingredients.  Stir the final mixture thoroughly.  At this point, you can serve (or drink) the piña coladas as they are.  It makes about twelve 8-oz servings.  Just remember that there are fifteen ounces of rum in the mixture.  I haven’t worked out the caloric content, but I can guess that it is astronomical.

After making too much for a recent party, I froze the excess and found that it makes a tasty dessert.  The ice crystals impart a crunchy texture that tickles your tongue while the concentrated flavor of rum complements the sweetness of tropical fruits, and the alcohol numbs the brain freeze for a pleasant buzz.  Just pour the mixture into popsicle molds, stick it in a cold freezer for a while, then enjoy.

I didn’t time how long it took to set.  If time is a concern, liquid nitrogen flash-freezing might work, but a styrofoam cooler full of dry ice would be less dangerous.  It would also keep the goodies frozen at a picnic, but could result in frostbite if not handled carefully.

Posted in Recipes