Personal Web Page

Thursday, July 30, 2009

 

Lazy Primes Sieve in Python

I read this interesting paper, The Genuine Sieve of Eratosthenes (PDF) by Melissa O'Neill, when it was discussed on Lambda the Ultimate. I came back to look at it again today because I wanted to improve my Haskell fluency, which I only seem to exercise when I read papers like this. I decided to try to correctly construe the Haskell code in the paper by implementing the algorithms in Python. I focused on reproducing the laziness but not the functional style because I could substitute iterators for lazy lists but had to use conventional control structures for all the pattern matching and tail calls.


#!/usr/bin/env python
# Author: Jared Brothers
#
# A Python version of the Haskell code from
# "The Genuine Sieve of Eratosthenes"
# www.cs.hmc.edu/~oneill/papers/Sieve-JFP.pdf
#
import heapq, itertools, operator

def sieve(xs):
"""Generate the prime numbers, given an iterable of candidate numbers.
Cross off multiples of prime numbers incrementally using iterators."""
table = []
while 1:
x = xs.next()
if table == [] or x < table[0][0]:
yield x
xs, ys = itertools.tee(xs)
timesx = (lambda x: lambda y: x*y)(x)
heapq.heappush(table, (x**2, itertools.imap(timesx, ys)))
else:
while table[0][0] <= x:
heapq.heapreplace(table, (table[0][1].next(), table[0][1]))

def wheel(factors=[2, 3, 5, 7], next=11):
"""Generate the distances between numbers not divisible by a list of small
primes, from the next prime up to the product of the list."""
circumference = reduce(operator.mul, factors)
prev = next
next += 1
end = next + circumference
while next < end:
if not any(next % factor == 0 for factor in factors):
yield next - prev
prev = next
next += 1

def spin(factors=[2, 3, 5, 7], next=11):
"""Generate candidates by making a wheel and cycling through it."""
for gap in itertools.cycle(wheel(factors, next)):
yield next
next += gap

def primes(k=5):
"""Generate primes with the sieve and wheel factorization, which filters
multiples of the first k primes."""
smallprimes = list(itertools.islice(sieve(itertools.count(2)), k + 1))
factors = smallprimes[:-1]
next = smallprimes[-1]
return itertools.chain(factors, sieve(spin(factors, next)))

Labels: , , ,


Wednesday, June 24, 2009

 

Code deployment with Puppet

I fielded a question from Matt on the NYC*BSD discussion list about using Puppet to do code deployment from a subversion repository to his web servers. Some time ago I came up with a nice way to do this but hadn't published the code for it anywhere. This works for me, but tmtowtdi and ymmv.

I have developers create a tag for the version they want deployed, then I define in the web server manifest (which is in subversion itself) an instance of a custom resource that manages subversion working copies given the url of the repository, which branch to use, and the path of the copy directory.

Using a working copy rather than an export makes upgrading code fast but leaves .svn directories lying around, so I have Apache configured to hide them with 404s. The code is short and simple, since it just abuses exec to run the svn binary to do a check out, check which version is in place, and switch to a newer branch if necessary.


# web-foo.pp
class web-foo {
if $branch {
subversion::workingcopy { foo:
repourl => 'http://svn.example.com/foo/tags',
branch => "$branch",
copydir => '/www/foo',
}
}
}

# modules/subversion/manifests/init.pp
class subversion {
package { subversion:
ensure => installed,
}
}
define subversion::workingcopy ($repourl, $branch, $copydir) {
include subversion
file { "$copydir":
owner => apache, group => apache, mode => 755,
ensure => directory,
}
# initial check out
exec { "svnco-$name":
command => "/usr/bin/svn co --non-interactive $repourl/$branch $copydir/$name",
creates => "$copydir/$name/.svn",
require => [Package[subversion], File["$copydir"]],
}
# switch branches
exec { "svnswitch-$name":
command => "/usr/bin/svn switch --non-interactive $repourl/$branch $copydir/$name",
unless => "/usr/bin/svn info $copydir/$name|/bin/grep -q 'URL:.*$branch\$'",
require => [Package[subversion], File["$copydir"], Exec["svnco-$name"]],
}
}


When the developers want me to deploy a new tagged version, I just change the value of the $branch variable to the name of the new tag. I use an external classifier, iClassify, which is a Ruby on Rails application that serves as a nice tagging interface for mapping nodes to classes and setting the values of variables. On its next run, Puppet sees that the working copy is the wrong version and switches to the new one. So, basically this provides declarative configuration of what code is on which servers with point and click deployment of new versions.

Labels: , ,


Sunday, January 4, 2009

 

Domino counting problem

My brother is writing a combinatorics paper about playing with dominoes. Pure mathematics is a lot of fun that way, but with more Hamiltonian paths of graphs embedded on a torus. Here is a Python script I wrote for him last night to solve a counting problem.


#!/usr/bin/env python
# Author: Jared Brothers
# Find the orderings of a set of dominoes such that
# adjacent dominoes share a number and the other two
# numbers differ by one.

def search(start):
""" Depth first search, generating all solutions. """
fringe = [start]
while fringe:
s = fringe.pop()
if s.check():
yield s
fringe.extend(s.successors())

class state():
""" A state in the search space. """
def __init__(self, d, r):
""" The ordered dominoes are in done (list),
and the unordered dominoes are in rest (set). """
self.done = d
self.rest = r

def __str__(self):
return str(self.done)

def check(self):
""" Is this a solution? """
return not self.rest

def successors(self):
""" The states you can get to by moving a domino from rest to done. """
if self.done:
# Try only dominoes adjacent to the previous one,
a,b = self.done[-1]
a1 = (a + 1) % n
a2 = (a - 1) % n
b1 = (b + 1) % n
b2 = (b - 1) % n
adjs = set([(a,b1), (b1,a), (a,b2), (b2,a), (b,a1), (a1,b), (b,a2), (a2,b)])
for x in self.rest.intersection(adjs):
d = self.done[:]
d.append(x)
r = self.rest.copy()
r.discard(x)
yield state(d, r)
else:
# or all the rest if nothing has been done yet.
for x in sorted(self.rest):
d = [x]
r = self.rest.copy()
r.discard(x)
yield state(d, r)

if __name__ == '__main__':
n = 7 # The number of numbers. The number of dominoes is sum(range(n + 1)).
start = state([], set((a, b) for a in range(n) for b in range(a + 1)))
for s in search(start):
print(s)

Labels: ,


Wednesday, December 17, 2008

 

Pull A String, A Puppet Moves

each man must realize
that it can all disappear very
quickly:
the cat, the woman, the job,
the front tire,
the bed, the walls, the
room; all our necessities
including love,
rest on foundations of sand -
and any given cause,
no matter how unrelated:
the death of a boy in Hong Kong
or a blizzard in Omaha …
can serve as your undoing.
all your chinaware crashing to the
kitchen floor, your girl will enter
and you’ll be standing, drunk,
in the center of it and she’ll ask:
my god, what’s the matter?
and you’ll answer: I don’t know,
I don’t know …

-Charles Bukowski

from Burning in Water Drowning in Flame. Not to be confused with:
Pulling Strings With Puppet: Configuration Management Made Easy

Labels: ,


Tuesday, December 9, 2008

 

This superblock wants to tell you something

The problem is, when you reboot a Linux server, it will probably be back up quickly, but it may want to check all the filesystems, which takes substantially longer. How do you know what to expect? The conditions that can trigger a fsck (the time and number of mounts since the last one) are stored in the superblock, which is annoying to inspect before planning an outage.

It doesn't seem worth doing the checks myself every time, but it would be nice to know. I didn't follow up on this until someone else complained about it, convincing me there was enough demand to justify a better user interface. Now I install the following script as a daily cron job with standard output redirected to the message of the day file. I decided to use a few older idioms to support Python 1.5.2 installed on Red Hat 7.3 (Valhalla). I would have just used awk if it had a decent date parsing library.


#!/usr/bin/env python
# Author: Jared Brothers
#
# Which ext2/3 filesystems will require a fsck, delaying the next reboot?
#
# Inspect the superblock of mounted filesystems with debugfs(8)
# to check whether the max mount count or time interval have been
# exceeded. List the overdue filesystems' mount points.
#

import os, sys, re, time, popen2, string
fsck = []
debug = 0
name = os.path.basename(sys.argv[0])

if os.geteuid() != 0:
sys.exit(name + ": not run as root")

fd1, fd0, fd2 = popen2.popen3("/bin/mount")
for line in fd1.readlines():

if re.search("type (ext2|ext3)", line):
fields = string.split(line)
dev = fields[0]
if debug: print("dev: " + dev)

fs = fields[2]
if debug: print("fs: " + fs)

fd1, fd0, fd2 = popen2.popen3("/sbin/debugfs -R show_super_stats " + dev)
for line in fd1.readlines():
fields = string.split(line)

if re.search("^Mount count:", line):
count = int(fields[2])

if re.search("^Maximum mount count:", line):
max = int(fields[3])

if re.search("^Last checked:", line):
last = string.strip(string.join(fields[2:], " "))

if re.search("^Check interval:", line):
check = int(fields[2])

if max != -1:
left = max - count
if debug: print("left: " + str(left))

if check != 0:
next = time.mktime(time.strptime(last)) + check
if debug: print("next: " + time.asctime(time.localtime(next)))

if max != -1 and left <= 0 or check != 0 and next <= time.time():
fsck.append(fs)

if len(fsck):
print("These ext2/3 filesystems are due for a fsck: "
+ string.join(fsck, ", "))

Monday, November 24, 2008

 

Assuming you have a terrycloth robe

On Halloween, I heard, "That guy is awesome," as someone passed me in the train station on my way to the office in the morning. Arthur Dent is an effortless costume, so I'm surprised anyone was impressed. You just wear your pyjamas all day and set the background of your phone to say "Don't Panic" in large friendly letters. Maybe he thought I was dressed as the Dude.



Just two things differentiate an iPhone from a canonical copy of The Hitchhiker's Guide to the Galaxy: our species' parochial travel habits and the all too limited lifespan of our cell phone batteries. If we were to make the advances in physics and engineering necessary to produce cell phone batteries that last more than a day and interstellar travel that took less than a day, there would be huge demand for Lonely Exoplanet guides in the App Store.

But in the meantime, I recommend getting the free Z-machine interpreter iPhone application, Frotz, and trying to puzzle out Douglas Adam's brilliant Infocom game, also called Hitchhiker's Guide to the Galaxy.

Labels: , ,


Thursday, November 20, 2008

 

A Mac running Vista is a Sad Mac

My dad asked me recently what kind of small form-factor PC he should buy. My response was highly biased by my interest in his switching to a platform I have more experience with. I'm not enthusiastic about offering desktop support, even for my family, but they could at least notice that I haven't used Windows much lately.

He's too set in his ways to switch to Linux, so I volunteered to buy him a Mac Mini and set it up to dual-boot Vista, in the hope he would eventually try out Leopard. I had a copy of Vista that I won at a Microsoft recruiting event, which I subsequently failed to sell on eBay. I had also won a redundant iPod Nano from Google in their Treasure Hunt contest. Every Mac should come with a free iPod, so I pitched that in the deal too.

Next to all these software companies wasting perfectly good schwag on me, the saddest part of the story is that, from the resigned tone of his complaints about Vista, it sounds like he's going to keep running Windows.

Labels: , , , ,


My Photo
Name: Jared Brothers
Location: New York, NY, United States

Links

Archives

November 2008   December 2008   January 2009   June 2009   July 2009  

This page is powered by Blogger. Isn't yours?

Subscribe to Posts [Atom]