If it won't be simple, it simply won't be. By Miki Tebeka, CEO, 353Solutions

Friday, March 30, 2007

HTML Entities

A quick way to see how all the HTML entities are displayed in your browser:


from urllib import urlopen
import re
import webbrowser

W3_URL = "http://www.w3.org/TR/WD-html40-970708/sgml/entities.html"
FILE_NAME = "/tmp/html-entities.html"
# Raw string, so that \s reaches the regex engine untouched
find_entity = re.compile(r"!ENTITY\s+([A-Za-z][A-Za-z0-9]+)").search

fo = open(FILE_NAME, "wt")

print >> fo, "<html><body><table border=\"1\">"

for line in urlopen(W3_URL):
    match = find_entity(line)
    if match:
        entity = match.group(1)
        # One row per entity: its name, then the entity as the browser renders it
        print >> fo, "<tr><td>%s</td><td>&%s;</td></tr>" % (entity, entity)
print >> fo, "</table></body></html>"
fo.close()

webbrowser.open(FILE_NAME)
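
If you are on Python 3, the same idea can be sketched as a small function that takes the lines and returns the page, which also makes it easy to try without hitting the network. This is only a sketch: the name `entity_table` is mine, not part of the original script.

```python
import re

# Same pattern as the script above, written as a raw string.
find_entity = re.compile(r"!ENTITY\s+([A-Za-z][A-Za-z0-9]+)").search


def entity_table(lines):
    """Return an HTML page with one table row per entity declared in `lines`.

    `entity_table` is a hypothetical helper name used for illustration.
    """
    rows = []
    for line in lines:
        match = find_entity(line)
        if match:
            name = match.group(1)
            # First cell: the entity name. Second cell: the rendered entity.
            rows.append("<tr><td>%s</td><td>&%s;</td></tr>" % (name, name))
    return '<html><body><table border="1">\n%s\n</table></body></html>' % "\n".join(rows)
```

To feed it the W3C page, decode each line first, since `urllib.request.urlopen` yields bytes in Python 3.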

Say "NO" to Internet Violence

I'll make an exception for Kathy Sierra, and post a non-technical entry.

Just say "NO" to any violence on the internet, and let's make it a better place for all of us.

Kathy, I hope you'll find the strength to overcome this.

Tuesday, March 27, 2007

Pushing Data - The Easy Way

One of the fastest ways to implement "pushing data to a server" is to have a CGI script on the server and push data to it from the clients.

This way you don't need to write a server, design a protocol, ... Just use an existing HTTP server (such as lighttpd) with CGI.

CGI Script:
#!/usr/bin/env python

from cgi import FieldStorage
from myapp import do_something_with_data

ERROR = "<html><body>Error: %s</body></html>"

def main():
    print "Content-Type: text/html"
    print

    form = FieldStorage()
    data = form.getvalue("data", "")
    key = form.getvalue("key", "").strip()
    if not (key and data):
        # Print the error page: SystemExit(message) would send it to stderr,
        # so the client would never see it
        print ERROR % "NO 'key' or 'data'"
        return

    try:
        do_something_with_data(key, data)
    except Exception, e:
        print ERROR % e
        return

    print "<html><body>OK</body></html>"

if __name__ == "__main__":
    main()
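
The `cgi` module was removed in Python 3.13, so here is a rough Python 3 sketch of the same checks written as a plain function over the form-encoded request body. The name `handle_push` and the idea of passing the callback in as an argument are my assumptions for illustration; in a real CGI script the body would be read from standard input.

```python
from html import escape
from urllib.parse import parse_qs

ERROR = "<html><body>Error: %s</body></html>"
OK = "<html><body>OK</body></html>"


def handle_push(body, do_something_with_data):
    """Return the HTML reply for a form-encoded request body.

    `handle_push` is a hypothetical name; `do_something_with_data` stands in
    for the application callback imported from `myapp` in the script above.
    """
    form = parse_qs(body, keep_blank_values=True)
    key = form.get("key", [""])[0].strip()
    data = form.get("data", [""])[0]
    if not (key and data):
        return ERROR % "NO 'key' or 'data'"

    try:
        do_something_with_data(key, data)
    except Exception as e:
        # Escape the message so it cannot inject markup into the reply.
        return ERROR % escape(str(e))

    return OK
```

Keeping the logic in a function that returns the reply makes it easy to check without a web server.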

"Pushing" script:
#!/usr/bin/env python

from urllib import urlopen, urlencode

CGI_URL = "http://localhost:8080/load.cgi"

def push_data(key, data):
    query = urlencode([("data", data), ("key", key)])
    try:
        # Passing a second argument makes urlopen send a POST request
        urlopen(CGI_URL, query).read()
    except IOError, e:
        pass # FIXME: Handle error

def main(argv=None):
    if argv is None:
        import sys
        argv = sys.argv

    from optparse import OptionParser
    from os.path import isfile, basename

    parser = OptionParser("usage: %prog FILENAME")

    opts, args = parser.parse_args(argv[1:])
    if len(args) != 1:
        parser.error("wrong number of arguments") # Will exit

    filename = args[0]
    if not isfile(filename):
        raise SystemExit("error: can't find %s" % filename)

    # The file name (without its directory) is the key, the content is the data
    key = basename(filename)
    data = open(filename, "rb").read()

    push_data(key, data)


if __name__ == "__main__":
    main()
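
For Python 3, the client side might look roughly like this. It is a sketch, not a drop-in replacement: `build_request` is a helper name I made up so the encoding can be checked without a running server, and returning `True`/`False` is just one possible way to fill in the FIXME above.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

CGI_URL = "http://localhost:8080/load.cgi"  # same URL as the script above


def build_request(key, data, url=CGI_URL):
    """Build a POST request carrying `key` and `data` as form fields.

    `build_request` is a hypothetical helper, split out for illustration.
    """
    query = urlencode([("data", data), ("key", key)]).encode("ascii")
    # A Request that has a body is sent as POST.
    return Request(url, data=query)


def push_data(key, data, url=CGI_URL):
    """Send the data; return True on success, False on a network or HTTP error."""
    try:
        with urlopen(build_request(key, data, url), timeout=10) as reply:
            reply.read()
        return True
    except OSError:  # URLError and HTTPError are subclasses of OSError
        return False
```

`urlencode` accepts both `str` and `bytes` values, so the file content can be read in binary mode as in the original script.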



(Thanks to Martin for the idea)

Wednesday, March 21, 2007

defaultdict

Python 2.5 has a defaultdict dictionary in the collections module.
defaultdict takes a factory function in the constructor. This function is called to create the default value each time you try to get a missing item, and the new value is stored under that key.
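
A small sketch of what "the factory creates the default value" means in practice (written for Python 3; the `factory` function and the `calls` list are only there to make the factory calls visible):

```python
from collections import defaultdict

counts = defaultdict(int)        # int() -> 0
value = counts["missing"]        # the factory supplies the default value...
stored = "missing" in counts     # ...and the lookup also stores the key

calls = []                       # records each time the factory runs


def factory():
    calls.append(1)
    return []


d = defaultdict(factory)
d["a"].append(1)                 # missing key: factory is called once
d["a"].append(2)                 # key exists now: factory is not called again
absent = d.get("b")              # .get() does not trigger the factory
```

Note that only item lookup (`d[key]`) triggers the factory; `d.get(key)` and `key in d` behave as on a regular dict.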

Then you can write a word histogram function like this:
from collections import defaultdict

def histogram(text):
    histogram = defaultdict(int) # int() -> 0
    for word in text.split():
        histogram[word] += 1
    return histogram

Or, if you want to store the locations of the words as well:
def histogram(text):
    histogram = defaultdict(list) # list() -> []
    for location, word in enumerate(text.split()):
        histogram[word].append(location)
    return histogram
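
To see both variants side by side, here is a short usage sketch in Python 3. I renamed the two functions to `word_counts` and `word_locations` (my names, not the original ones) so both can live in one file; the sample text is made up.

```python
from collections import defaultdict


def word_counts(text):
    """Count how many times each word appears."""
    counts = defaultdict(int)  # int() -> 0
    for word in text.split():
        counts[word] += 1
    return counts


def word_locations(text):
    """Map each word to the list of word positions where it appears."""
    locations = defaultdict(list)  # list() -> []
    for location, word in enumerate(text.split()):
        locations[word].append(location)
    return locations


text = "the quick fox jumps over the lazy fox"
counts = word_counts(text)
locations = word_locations(text)
print(dict(counts))
print(dict(locations))
```

The locations are word indices (0 for the first word), not character offsets.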
