
Living in Berkeley

I’m now settled into a studio just south of the UC Berkeley campus. With a built-in secretary, a lock on just the bedroom side of the door to the kitchen, and a tight service stairway out of the kitchen, the apartment feels bizarrely colonial.

I’m only sometimes here though. I was just in NYC for a week, and I fly back for another week on Monday. After some prodding at my going-away party, I’m going to take these trips as an opportunity to get back into a little D&D. Here’s the idea for my nascent campaign:

The year is 500 BCE, and the Persian Empire is the crossroads of the world. This is not quite the ancient Persia of the history books: it is a place of wonders, legends, and secret crafts. But times are changing, whispered by sages and hinted at in strange news from distant lands. They say that new gods are coming, old gods will fall, and it is time for everyone to gather their allies close for the coming chaos.

I’ve also been having some fun with GIS, to combine fantasy and history:

Guest Post: The trouble with anticipation (Nate Neligh)

Hello everyone, I am here to do a little guest blogging today. Instead of some useful empirical tools or interesting analysis, I want to take you on a short tour through some of the murkier aspects of economic theory: anticipation. The very idea of the ubiquitous Nash Equilibrium is rooted in anticipation. Much of behavioral economics is focused on determining how people anticipate one another’s actions. While economists have a pretty decent handle on how people anticipate and act in repeated games (the same game played over and over) and in small games with a few different decisions, not as much work has been put into studying long games with complex history dependence. To use an analogy, economists have done a lot of work on games that look like poker, but much less work on games that look like chess.

One of the fundamental problems is finding a long form game that has enough mathematical coherence and deep structure to allow the game to be solved analytically. Economists like analytical solutions when they are available, but it is rare to find an interesting game that can be solved by pen and paper.

Brute-force simulation can be helpful. By simulating all possible outcomes and using a technique called backwards induction, we can solve the game in a Nash Equilibrium sense, but this approach has drawbacks. First, the technique is limited: even with a wonderful computer and a lot of time, some games simply cannot be solved in human time due to their complexity. More importantly, any solutions that are derived are not realistic. The average person does not have the ability to perform the same computations as a supercomputer. On the other hand, people are not as simple as the mechanical actions of a physics-inspired model.
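
To make the idea concrete, here is a minimal sketch of backwards induction on a toy two-player game tree. This is not our model, just an illustration; the tuple-based tree and the payoffs are made up.

# A minimal sketch (not from our model) of backwards induction on a small
# two-player game tree. A terminal node is ('leaf', payoffs); an interior
# node is ('move', player_index, [children]).

def backward_induction(node):
    """Return the payoff vector reached under subgame-perfect play."""
    if node[0] == 'leaf':
        return node[1]
    _, player, children = node
    # The player moving here keeps whichever continuation is best for her.
    return max((backward_induction(child) for child in children),
               key=lambda payoffs: payoffs[player])

# Tiny example: player 0 moves first, player 1 responds.
game = ('move', 0, [
    ('move', 1, [('leaf', (3, 1)), ('leaf', (0, 0))]),
    ('move', 1, [('leaf', (2, 2)), ('leaf', (1, 3))]),
])
print(backward_induction(game))  # (3, 1)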

James and I have been working on a game of strategic network formation which effectively illustrates all these problems. The model takes two parameters (the number of nodes and the cost of making new connections) and uses them to strategically construct a network in a decentralized way. The rules are extremely simple and almost completely linear, but the complexities of backwards induction make it impossible to solve by hand for a network of any significant size (though some modifications can be added which shrink the state space to the point where the game can be solved). Backwards induction doesn’t work for large networks, since the number of possible outcomes grows far too quickly, but what we can see is intriguing. The results seem to follow a pattern, yet they are not predictable.

[Figure: The trouble with anticipation]

Each region of a different color represents a different network (colors selected based on network properties). The y-axis is the discrete number of nodes in the network; the x-axis is a continuous cost parameter. Compare where the colors change as the cost parameter is varied across the different numbers of nodes. As you can see, switch points tend to be somewhat similar across network scales, but they are not completely consistent.

Currently we are exploring a number of options; I personally think that agent-based modeling is going to be the key to tackling this type of problem (and those that are even less tractable) in the future. Agent-based models and genetic algorithms have the potential to be more realistic and more tractable than more traditional solutions.

A month of nominal changes

I’ve been busy! In the last month, I have collected an appalling list of achievements which mean much to the world and very little to life as I live it.

First, I am a doctor, as of May 20. Not a real doctor, and Flame won’t let me wear a stethoscope anyway. But my program in sustainable development is officially over. Interestingly, this is nothing like the job changes I have had before: I still work on the same projects and attend the same meetings with the same people. But in theory, I am now unemployed, and I will soon be a UC Berkeley employee, with similarly slight impacts.

Second, I can now drive a car. Of course, I could before, and have been acceptably competent at it for the past six months. But the winter is a horrible time to take a road test, and New York City is a horrible place for one. My license was finally approved on Monday. I have yet to experience the joys or sorrows of driving alone, but I hear California is great for that.

I have also finished my Hepatitis A and B shot series and gotten a new Yellow Fever vaccination. I think I was already immune with the first shots, and only people who lose their international immunization card need a second Yellow Fever vaccine, but now I have paperwork for all three. And, twelve years out, I am not quite done with my student loans, but with $101.58 left, I might as well be.

Flame and I are now ensconced in a tiny apartment on the corner of Prospect Park, Brooklyn. Officially we have had the apartment for over a month, but we just changed residences last week. So, I suppose with all of the nominal changes, there are a few real ones too. It has been an exciting journey! But some time I will need at least a nominal vacation.

Google Scholar Alerts to RSS: A punctuated equilibrium

If you’re like me, you have a pile of Google Scholar Alerts that you never manage to read. It’s a reflection of a more general problem: how do you find good articles, when there are so many articles to sift through?

I’ve recently started using Sux0r, a Bayesian filtering RSS feed reader. However, Google Scholar sends alerts to one’s email, and we’ll want to extract each paper as a separate RSS item.

[Screenshot: a Google Scholar alert email]

Here’s my process, and the steps for doing it yourself:

Google Scholar Alerts → IFTTT → Blogger → Perl → DreamHost → RSS → Bayesian Reader

  1. Create a Blogger blog that you will just use for Google Scholar Alerts: Go to the Blogger Home Page and follow the steps under “New Blog”.
  2. Sign up for IFTTT (if you don’t already have an account), and create a new recipe to post emails from scholaralerts-noreply@google.com to your new blog. The channel for the trigger is your email system (Gmail for me); the trigger is “New email in inbox from…”; the channel for the action is Blogger; and the title and labels can be whatever you want, as long as the body is “{{BodyPlain}}” (which includes HTML).

    [Screenshot: the IFTTT recipe trigger]

  3. Modify the Perl code below, pointing it to the front page of your new Blogger blog. It will return an RSS feed when called at the command line (perl scholar.pl).

    [Screenshot: the resulting RSS feed]

  4. Upload the Perl script to your favorite server (mine, http://existencia.org/, is powered by DreamHost).
  5. Point your favorite RSS reader to the URL of the Perl script as an RSS feed, and wait as the Google Alerts come streaming in!

Here is the code for the Alert-Blogger-to-RSS Perl script. All you need to do is fill in the $url line below.

#!/usr/bin/perl -w
use strict;
use CGI qw(:standard);

use XML::RSS; # Library for RSS generation
use LWP::Simple; # Library for web access

# Download the first page from the blog
my $url = "http://mygooglealerts.blogspot.com/"; ### <-- FILL IN HERE!
my $input = get($url);
my @lines = split /\n/, $input;

# Set up the RSS feed we will fill
my $rss = XML::RSS->new(version => '2.0');
$rss->channel(title => "Google Scholar Alerts");

# Iterate through the lines of HTML
my $ii = 0;
while ($ii < $#lines) {
    my $line = $lines[$ii];
    # Look for a <h3> starting the entry
    if ($line !~ /^<h3 style="font-weight:normal/) {
        $ii++;
        next;
    }

    # Extract the title and link
    $line =~ /<a href="([^"]+)"><font .*?>(.+)<\/font>/;
    my $title = $2;
    my $link = $1;

    # Extract the authors and publication information
    my $line2 = $lines[$ii+1];
    $line2 =~ /<div><font .+?>([^<]+?) - (.*?, )?(\d{4})/;
    my $authors = $1;
    my $journal = (defined $2) ? $2 : '';
    my $year = $3;

    # Extract the snippets
    my $line3 = $lines[$ii+2];
    $line3 =~ /<div><font .+?>(.+?)<br \/>/;
    my $content = $1;
    for ($ii = $ii + 3; $ii < @lines; $ii++) {
        my $linen = $lines[$ii];
        # Are we done, or is there another line of snippets?
        if ($linen =~ /^(.+?)<\/font><\/div>/) {
            $content = $content . '<br />' . $1;
            last;
        } else {
            $linen =~ /^(.+?)<br \/>/;
            $content = $content . '<br />' . $1;
        }
    }
    $ii++;

    # Use the title and publication for the RSS entry title
    my $longtitle = "$title ($authors, $journal $year)";

    # Add it to the RSS feed
    $rss->add_item(title => $longtitle,
                   link => $link,
                   description => $content);
        
    $ii++;
}

# Write out the RSS feed
print header('application/rss+xml');
print $rss->as_string;

In Sux0r, here are a couple of items from the final result:

[Screenshot: items from the feed in Sux0r]

Scripts for Twitter Data

Twitter data – the endless stream of tweets, the user network, and the rise and fall of hashtags – offers a flood of insight into the minute-by-minute state of society. Or at least one self-selecting part of it. A lot of people want to use it for research, and it turns out to be pretty easy to do so.

You can either purchase Twitter data or collect it in real time. If you purchase Twitter data, it’s all organized for you and available historically, but it is basically nothing you can’t get yourself by monitoring Twitter in real time. I’ve used GNIP, where the going rate was about $500 per million tweets in 2013.

There are two main ways to collect data directly from Twitter: “queries” and the “stream”. Queries let you retrieve up to 1,000 tweets at any point in time – the most recent tweets that match your search criteria. The stream continuously delivers a fraction of a percent of all tweets, filtered by your criteria, which adds up very quickly.

Scripts for doing these two options are below, but you need to decide on the search/streaming criteria. Typically, these are search terms and geographical constraints. See Twitter’s API documentation to decide on your search options.

Twitter uses an authentication system to identify both the individual collecting the data and the tool helping them do it. It is easy to register a new tool, whereby you pretend that you’re a startup with a great new app. Here are the steps:

  1. Install python’s twitter package, using “easy_install twitter” or “pip install twitter”.
  2. Create an app at http://ift.tt/1oHSTpv. Leave the callback URL blank, but fill in the rest.
  3. Set the CONSUMER_KEY and CONSUMER_SECRET in the code below to the values you get on the keys and access tokens tab of your app.
  4. Fill in the name of the application.
  5. Fill in any search terms or structured searches you like.
  6. If you’re using the downloaded scripts, which output data to a CSV file, change where the file is written to a directory of your choosing (where it says “twitter/us_”).
  7. Run the script from your computer’s terminal (e.g., python search.py).
  8. The script will pop up a browser for you to log into twitter and accept permissions from your app.
  9. Get data.

Here is what a simple script looks like:

import os, twitter

APP_NAME = "Your app name"
CONSUMER_KEY = 'Your consumer key'
CONSUMER_SECRET = 'Your consumer token'

# Do we already have a token saved?
MY_TWITTER_CREDS = os.path.expanduser('~/.class_credentials')
if not os.path.exists(MY_TWITTER_CREDS):
    # This will ask you to accept the permissions and save the token
    twitter.oauth_dance(APP_NAME, CONSUMER_KEY, CONSUMER_SECRET,
                        MY_TWITTER_CREDS)

# Read the token
oauth_token, oauth_secret = twitter.read_token_file(MY_TWITTER_CREDS)

# Open up an API object, with the OAuth token
api = twitter.Twitter(api_version="1.1", auth=twitter.OAuth(oauth_token, oauth_secret, CONSUMER_KEY, CONSUMER_SECRET))

# Perform our query
tweets = api.search.tweets(q="risky business")

# Print the results
for tweet in tweets['statuses']:
    if 'text' not in tweet:
        continue

    print tweet
    break

For automating Twitter collection, I’ve put together scripts for queries (search.py), streaming (filter.py), and bash scripts that run them repeatedly (repsearch.sh and repfilter.sh). Download the scripts.
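
For reference, a streaming collector built on the same python twitter package looks roughly like the sketch below. This is not the downloaded filter.py, just a minimal illustration; the credentials file and the “risky business” track term mirror the query script above.

import os, twitter

APP_NAME = "Your app name"
CONSUMER_KEY = 'Your consumer key'
CONSUMER_SECRET = 'Your consumer token'

# Reuse the OAuth token saved by the query script above
MY_TWITTER_CREDS = os.path.expanduser('~/.class_credentials')
if not os.path.exists(MY_TWITTER_CREDS):
    twitter.oauth_dance(APP_NAME, CONSUMER_KEY, CONSUMER_SECRET,
                        MY_TWITTER_CREDS)
oauth_token, oauth_secret = twitter.read_token_file(MY_TWITTER_CREDS)

# Open a streaming connection instead of the REST search API
stream = twitter.TwitterStream(auth=twitter.OAuth(oauth_token, oauth_secret,
                                                  CONSUMER_KEY, CONSUMER_SECRET))

# Tweets matching the filter arrive continuously; handle them as they come
for tweet in stream.statuses.filter(track="risky business"):
    if 'text' not in tweet:
        continue
    print(tweet['text'])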

To use the repetition scripts, make them executable by running “chmod a+x repsearch.sh repfilter.sh”. Then run them by typing ./repfilter.sh or ./repsearch.sh. Note that these will create many, many files over time, which you’ll have to merge together.
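
When it comes time to merge them, something like the sketch below will do. It assumes the scripts wrote files with a shared prefix like “twitter/us_” (see step 6) and simply concatenates them; drop repeated header rows or duplicate tweets afterwards as needed.

import glob

# Concatenate every file written by the repetition scripts into one CSV.
# The "twitter/us_" prefix matches step 6 above; change it if you used another.
with open("twitter/merged.csv", "w") as outfile:
    for path in sorted(glob.glob("twitter/us_*")):
        with open(path) as infile:
            outfile.write(infile.read())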

US Water Network

The America’s Water project, coordinated at Columbia’s Water Center by Upmanu Lall, is trying to understand the US water system as an integrated whole, and how that system will evolve over the coming decades. Doing so will require a comprehensive model, incorporating agriculture, energy, cities, policy, and more.

We are just beginning to lay the foundation for that model. A first step is to create a network of links between station gauges around the US, representing upstream and downstream flows and counties served. The ultimate form of that model will rely on physical flow data, but I created a first pass using simple rules:

  1. Every gauge can be connected to only one downstream gauge (but not vice versa).
  2. Upstream gauges must be at a higher elevation than downstream gauges.
  3. Upstream gauges must be fed by a smaller drainage basin than downstream gauges.
  4. Of the gauges that satisfy the first two constraints, the chosen downstream gauge is the one with the shortest distance and the most “plausible” streamflow.

The full description is available on Overleaf. I’ve applied the algorithm to the GAGES-II database from USGS, which includes all station gauges with at least 20 years of data.
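
As a rough illustration of rules 2 through 4 (not the actual algorithm from the Overleaf write-up), here is a sketch with a few made-up gauge records; the distance-plus-flow scoring is just one plausible reading of the “shortest distance and most plausible streamflow” rule.

from math import hypot

# Hypothetical gauge records (made up for illustration): elevation, drainage
# basin area, planar coordinates, and a representative streamflow.
gauges = [
    {'id': 'A', 'elev': 320.0, 'drainage': 1500.0,  'x': 0.0,  'y': 0.0,  'flow': 12.0},
    {'id': 'B', 'elev': 150.0, 'drainage': 9000.0,  'x': 40.0, 'y': 10.0, 'flow': 55.0},
    {'id': 'C', 'elev': 100.0, 'drainage': 20000.0, 'x': 90.0, 'y': 15.0, 'flow': 130.0},
]

def downstream_candidate(gauge, gauges):
    """Pick a single downstream gauge: it must be lower (rule 2) and drain a
    larger basin (rule 3); among those, score by distance and flow plausibility."""
    best, best_score = None, None
    for other in gauges:
        if other is gauge:
            continue
        if other['elev'] >= gauge['elev'] or other['drainage'] <= gauge['drainage']:
            continue
        distance = hypot(other['x'] - gauge['x'], other['y'] - gauge['y'])
        # One reading of rule 4: penalize links where flow drops going downstream;
        # the weight of 10 here is arbitrary.
        implausibility = max(0.0, gauge['flow'] - other['flow'])
        score = distance + 10.0 * implausibility
        if best_score is None or score < best_score:
            best, best_score = other, score
    return best

for gauge in gauges:
    down = downstream_candidate(gauge, gauges)
    print(gauge['id'] + ' -> ' + (down['id'] if down else 'none'))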

[Figure: the constructed gauge-to-gauge network]

Every red dot is a gauge, black lines are upstream-downstream connections between gauges, and the blue and green lines connect counties to gauges by rules similar to the ones above (green edges where the link is forced to be longer than 100 km).

This kind of network opens the door for a lot of interesting analyses. For example, if agricultural withdrawals increase in the midwest, how much less water will be available downstream? We’re working now to construct a full optimization model that accounts for upstream dependencies.

Another simple question is, how much of the demand in each county is satisfied by the flows available to it? Here are the results; many cities show up in sharp red, showing that their demands exceed the available surface water by a factor of 10 or more.

[Figure: surface water supply relative to demand, by county]

Games from Mac Plus

About a decade ago, I got a 3.5″ floppy reader for my laptop, and every so often I’ve gone through a pile of disks to see if anything is still readable and worth saving. I think those days are over: a metal disk protector is now stuck in the reader, and all the software available for Windows to read Mac disks appears to be broken or commercial.

But my most recent pile brought back memories of many happy hours of simple and elegant games. Some day I’ll write about my latter-day favorites (Armor Alley, Dark Castle, Prince of Persia) or the less action-oriented BBS and World Builder games I also loved, but right now I’m remembering some space games that brought a particular joy.

Crystal Quest

Probably a descendant of Asteroids, a game made progressively more difficult by space creatures that appear first as curiosities and eventually with ferocity.

Continuum

A space game of puzzles, with a big library of widgets, and a builder of new levels.

Sitting on geodes

Sometimes I think that research is more like mining than maze-solving. Like art (I imagine), the gems that we are able to bring forth are buried inside of us. Each of us stands on a vast mineral deposit, the accumulated layers of our experiences and our unconscious foundation. By our 30s, we’ve learned to grow a harvest in our topsoil, but we’ve also had a chance to dig deeper and get a sense of that wealth. One of the challenges of life is to ensure that we get to keep digging under our own feet.


From Saturday Morning Breakfast Cereal.

Some people pan for precious metals; others plan out whole quarries of ore. Research techniques (and philosophical modes, literary critique, drawing technique, etc.) allow us to mine at will, but each works best on certain kinds of stone. You can dig shallow, and strike oil or gas able to propel you through the economic world. You can dig deep, and find unique and precious gems, metamorphosed by the heat and pressure of the unconscious mind. If you dig too deep, you hit impenetrable bedrock.

Me, I look for geodes. Each of these rough stones contains a cavity filled with crystals. You can tell a geode by its face, but you never know what’s inside until you break it open. I don’t like throwing away research projects, even if I don’t have time for them, because I still want to break them open. On the other hand, I know that the more I dig, the more geodes I can find. And so, I can choose to leave gems in the ground, waiting at unforetold depths.

New Year’s Resolutions

I love New Year’s resolutions: a ritual opportunity to adjust the choices that make up life. Like everyone, I struggle with them (read: give up frequently), but part of the joy is to understand that process and resolve better.

I’m expecting a big semester, starting soon: my Complexity Science course, bigger and better; finishing my thesis; being substantively involved in three large projects and several small ones; and getting a job. My theory of organization this time is to schedule: my work days are specified to the hour for the projects I hope to finish by the end of the semester:
[Screenshot: my hour-by-hour semester schedule]

My resolutions are mostly following the same idea, recognizing time less as a limiting factor than as an organizing principle:

  • Additional morning exercise (15 min. / week)
  • Personal or professional blogging (30 min. / week)
  • Review my colleagues’ interests and activities (30 min. / week) [next year follow-up: usefully encode my network]
  • Write to distant friends (30 min. / week)
  • Deep reflection on goals and activities (1 hr. / week)
  • Go for a hike outside the city in every month [next year follow-up: hike the same trail every month of the year]
  • Read a journal cover-to-cover every week [next year follow-up: become a regular reader of one journal]