Lists and dictionaries
Given a flat list, like
[key1, value1, key2, value2]
convert it to an alist or dictionary:
>>> toalist = lambda kvs: zip(kvs[0::2], kvs[1::2])
>>> toalist(range(4))
[(0, 1), (2, 3)]
>>> dict(toalist(range(4)))
{0: 1, 2: 3}
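The slicing version above needs a sequence that supports indexing. A minimal sketch, assuming the input may be any iterable (a generator, say), pairs consecutive items by zipping one iterator with itself. The helper name `pairs` is made up for this illustration:

```python
def pairs(kvs):
    """Pair up consecutive items: [k1, v1, k2, v2] -> (k1, v1), (k2, v2)."""
    it = iter(kvs)
    # zip pulls alternately from the same iterator, so each tuple
    # receives two consecutive items; a trailing odd item is dropped.
    return zip(it, it)

flat = ["key1", "value1", "key2", "value2"]
as_alist = list(pairs(flat))
as_dict = dict(pairs(flat))
```

Because only one pass over the input is made, this also works when the flat data cannot be sliced.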
Convert a dictionary to a flat list:
>>> # dict to alist
... al = list({1:2,3:4}.iteritems())
>>> al
[(1, 2), (3, 4)]
>>> # alist to flat list
... reduce(lambda acc,t: acc + list(t), al, [])
[1, 2, 3, 4]
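The reduce call above builds a new list at every step, which gets slow for large dictionaries. As a hedged alternative (not the original recipe), a nested list comprehension produces the same flat list in one pass and runs unchanged on Python 2 and 3:

```python
d = {1: 2, 3: 4}

# Outer loop walks the (key, value) pairs, inner loop unpacks each pair.
flat = [x for kv in d.items() for x in kv]

# Round trip back to a dictionary with the slicing trick.
restored = dict(zip(flat[0::2], flat[1::2]))
```

The order of the pairs in `flat` follows the dictionary's own iteration order.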
To transpose a list of lists/tuples, unpack it into the arguments of zip, i.e. zip(*mylist):
>>> l = list(enumerate("abcdef"))
>>> l
[(0, 'a'), (1, 'b'), (2, 'c'), (3, 'd'), (4, 'e'), (5, 'f')]
>>> # transpose list of lists/tuples
... zip(*l)
[(0, 1, 2, 3, 4, 5), ('a', 'b', 'c', 'd', 'e', 'f')]
>>> # once again
... zip(*_)
[(0, 'a'), (1, 'b'), (2, 'c'), (3, 'd'), (4, 'e'), (5, 'f')]
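One caveat with this trick: zip stops at the shortest input, so rows of unequal length are silently truncated. A small sketch, assuming padding with None is acceptable, uses the longest-input variant from itertools (named izip_longest on Python 2.6+ and zip_longest on Python 3, hence the guarded import):

```python
try:
    from itertools import zip_longest                    # Python 3
except ImportError:
    from itertools import izip_longest as zip_longest    # Python 2.6+

rows = [(0, "a"), (1, "b"), (2,)]    # the last row is one item short

# Plain zip drops the second column because the last row has no item for it.
truncated = list(zip(*rows))

# zip_longest keeps every column and fills the gaps with fillvalue.
padded = list(zip_longest(*rows, fillvalue=None))
```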
Flatten a list of lists:
>>> lofl = [[1,2], [3], [4,5]]
>>> import operator
>>> reduce(operator.add, lofl)
[1, 2, 3, 4, 5]
An alternative approach is to use chain from itertools (this also works on huge lists if used wisely!):
>>> import itertools
>>> list(itertools.chain(*lofl))
[1, 2, 3, 4, 5]
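"Used wisely" here presumably means not materialising everything at once. A minimal sketch under that assumption: chain.from_iterable takes the outer iterable itself instead of unpacking it into arguments, so the outer sequence can be a generator too. The helper name `flatten` is illustrative only:

```python
from itertools import chain, islice

def flatten(list_of_lists):
    """Lazily yield the items of each inner iterable, one level deep."""
    return chain.from_iterable(list_of_lists)

lofl = [[1, 2], [3], [4, 5]]
flat = list(flatten(lofl))

# The outer iterable may be a generator; only what is consumed gets built.
rows = ([i, i + 1] for i in range(0, 6, 2))
first_three = list(islice(flatten(rows), 3))
```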
Apply a function to either an iterable (list, tuple) or a scalar:
>>> def fmap(f,xs):
...     try: return map(f,xs)
...     except TypeError: return f(xs)
...
>>> fmap(lambda x:x*x, range(5))
[0, 1, 4, 9, 16]
>>> fmap(lambda x:x*x, 5)
25
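One weakness of the try/except form is that a TypeError raised inside f itself is also caught, after which f is retried on the whole argument. A hedged variant, not the original recipe, tests iterability first with iter() and returns a list on both Python 2 and 3. The helper `square` exists only for the demonstration:

```python
def fmap(f, xs):
    """Apply f to every item of xs, or to xs itself if it is not iterable."""
    try:
        it = iter(xs)
    except TypeError:
        # xs is a scalar: apply f directly.
        return f(xs)
    # A TypeError raised inside f now propagates instead of being masked.
    return [f(x) for x in it]

def square(x):
    return x * x

squares = fmap(square, range(5))
single = fmap(square, 5)
```

Note that strings are iterable, so either version maps f over the characters of a string instead of treating it as a scalar.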
Strings and Unicode
Unicode handling is changing in 3.0. For earlier versions, it is important to distinguish byte strings ("abc") from unicode strings (u"abc"). The former can be converted to the latter with unicode():
>>> "абв"
'\xd0\xb0\xd0\xb1\xd0\xb2'
>>> u"абв"
u'\u0430\u0431\u0432'
>>> unicode("абв","utf8")
u'\u0430\u0431\u0432'
Note that the original literal contains three characters and the unicode string holds three values, one per character, while the plain string holds six bytes. The unicode form is how text should be represented internally.
Any communication with the outside world usually requires that unicode data be encoded. There are various encodings; UTF-8 is one of the most common. Any encoded input should be decoded before it is processed:
>>> "абв".decode("utf8")
u'\u0430\u0431\u0432'
>>> u"абв".encode("utf8")
'\xd0\xb0\xd0\xb1\xd0\xb2'
To live a long and happy life, it is important to know whether you are working with encoded data (in practice, binary data) or decoded unicode text.
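The distinction can be sketched as follows. The example uses escape sequences instead of literal Cyrillic so that it does not depend on the source file encoding, and it is written to run on both Python 2.6+ and Python 3.3+ (an assumption about your interpreter):

```python
text = u"\u0430\u0431\u0432"     # decoded text: three characters
data = text.encode("utf-8")      # encoded form: six bytes in UTF-8

# Decode once at the input boundary, work on text, encode once on output.
round_trip = data.decode("utf-8")

# Decoding with the wrong codec fails loudly instead of guessing.
try:
    data.decode("ascii")
    wrong_codec_failed = False
except UnicodeDecodeError:
    wrong_codec_failed = True
```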
To test if an object is a string (either a plain string or unicode), test if it is an instance of basestring:
>>> isinstance("abc",basestring)
True
>>> isinstance(u"abc",basestring)
True
>>> isinstance(42,basestring)
False
To convert to and from a string (the function to use depends on the target type):
>>> str(42)
'42'
>>> unicode(42)
u'42'
>>> int("42")
42
>>> float("42")
42.0
Backporting to Python 2.4
With Python 2.5, 2.6 and even 3.0 around, I still need to make some scripts run with Python 2.4. Just two tricks. To make sqlite3 code work:
try:
    import sqlite3
except ImportError:
    from pysqlite2 import dbapi2 as sqlite3  # cheating with py2.4
and to make ElementTree work:
try:
    import xml.etree.ElementTree as ET
except ImportError:
    import cElementTree as ET  # no xml.etree in py2.4, use cElementTree