Initial push...
This commit is contained in:
651
src/Python/Python Performance Tips/Python Performance Tips.html
Normal file
651
src/Python/Python Performance Tips/Python Performance Tips.html
Normal file
@@ -0,0 +1,651 @@
|
||||
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
|
||||
<html><head>
|
||||
<title>Python Performance Tips</title>
|
||||
<meta http-equiv="Content-Type" content="text/html; charset=windows-1252">
|
||||
</head>
|
||||
|
||||
|
||||
<body>
|
||||
|
||||
|
||||
<p align="center">
|
||||
<script type="text/javascript">
|
||||
<!--
|
||||
google_ad_client = 'pub-7625309919520871';
|
||||
google_ad_width = 468;
|
||||
google_ad_height = 60;
|
||||
google_ad_format = '468x60_as';
|
||||
// -->
|
||||
</script>
|
||||
<script type="text/javascript">
|
||||
src="http://pagead2.googlesyndication.com/pagead/show_ads.js">
|
||||
</script>
|
||||
</p>
|
||||
|
||||
<h1 align="center">Python Performance Tips</h1>
|
||||
|
||||
<p> This page is devoted to various tips and tricks that help improve the
|
||||
performance of your Python programs. Wherever the information comes from
|
||||
someone else, I've tried to identify the source. <strong>Note: </strong>I
|
||||
originally wrote this in (I think) 1996 and haven't done a lot to keep it
|
||||
updated since then. Python has has changed in some significant ways since
|
||||
then, which means that some of the orderings will have changed. You should
|
||||
always test these tips with your application and the version of Python you
|
||||
intend to use and not just blindly accept that one method is faster than
|
||||
another.</p>
|
||||
|
||||
<p> Also new since this was originally written are packages like <a href="http://www.cosc.canterbury.ac.nz/~greg/python/Pyrex/">Pyrex</a>, <a href="http://psyco.sourceforge.net/">Psyco</a>, <a href="http://www.scipy.org/site_content/weave">Weave</a>, and <a href="http://pyinline.sourceforge.net/">PyInline</a>, which can dramatically
|
||||
improve your application's performance by making it easier to push
|
||||
performance-critical code into C or machine language.</p>
|
||||
|
||||
<p> If you have any light to shed on this subject, <a href="mailto:skip@pobox.com">let me know</a>.</p>
|
||||
|
||||
<h2>Contents</h2>
|
||||
<ul>
|
||||
<li><a href="#profiling">Profiling Code</a>
|
||||
<ul>
|
||||
<li><a href="#profile">Profile Module</a></li>
|
||||
<li><a href="#trace">Trace Module</a></li>
|
||||
</ul>
|
||||
</li>
|
||||
<li><a href="#sorting">Sorting</a></li>
|
||||
<li><a href="#stringcat">String concatenation</a></li>
|
||||
<li><a href="#loops">Loops</a></li>
|
||||
<li><a href="#dots">Avoiding dots...</a></li>
|
||||
<li><a href="#local">Local Variables</a></li>
|
||||
<li><a href="#initdict">Initializing Dictionary Elements</a></li>
|
||||
<li><a href="#import">Import Statement Overhead</a></li>
|
||||
<li><a href="#setgetdel">Using <code>map</code> with Dictionaries</a></li>
|
||||
<li><a href="#aggregate">Data Aggregation</a></li>
|
||||
<li><a href="#periodic">Doing Stuff Less Often</a></li>
|
||||
<li><a href="#notc">Python is not C</a></li>
|
||||
</ul>
|
||||
|
||||
<h2><a name="profiling">Profiling Code</a></h2>
|
||||
|
||||
<p> The first step to speeding up your program is learning where the
|
||||
bottlenecks lie. It hardly makes sense to optimize code that is never
|
||||
executed or that already runs fast. I use two modules to help locate the
|
||||
hotspots in my code, profile and trace. In later examples I also use the
|
||||
<code>timeit</code> module, which is new in Python 2.3.</p>
|
||||
|
||||
<h3><a name="profile">Profile Module</a></h3>
|
||||
|
||||
<p> The <a href="http://www.python.org/doc/current/lib/module-profile.html">profile
|
||||
module</a> is included as a standard module in the Python distribution.
|
||||
Using it to profile the execution of a set of functions is quite easy.
|
||||
Suppose your main function is called <code>main</code>, takes no arguments
|
||||
and you want to execute it under the control of the profile module. In its
|
||||
simplest form you just execute</p>
|
||||
|
||||
<pre>import profile
|
||||
profile.run('main()')
|
||||
</pre>
|
||||
|
||||
<p>When <code>main()</code> returns, the profile module will print a table
|
||||
of function calls and execution times. The output can be tweaked using the
|
||||
Stats class included with the module. In Python 2.4 profile will allow the
|
||||
time consumed by Python builtins and functions in extension modulesto be
|
||||
profiled as well.</p>
|
||||
|
||||
<h3><a name="hotshot">Hotshot Module</a></h3>
|
||||
|
||||
<p>New in Python 2.2, the <a href="http://www.python.org/doc/current/lib/module-hotshot.html">hotshot
|
||||
package</a> is intended as a replacement for the profile module. The
|
||||
underlying module is written in C, so using hotshot should result in a much
|
||||
smaller performance hit, and thus a more accurate idea of how your
|
||||
application is performing. There is also a <code>hotshotmain.py</code>
|
||||
program in the distributions <code>Tools/scripts</code> directory which
|
||||
makes it easy to run your program under hotshot control from the command
|
||||
line.</p>
|
||||
|
||||
<h3><a name="trace">Trace Module</a></h3>
|
||||
|
||||
<p>The trace module is a spin-off of the profile module I wrote originally
|
||||
to perform some crude statement level test coverage. It's been heavily
|
||||
modified by several other people since I released my initial crude effort.
|
||||
As of Python 2.0 you should find trace.py in the Tools/scripts directory of
|
||||
the Python distribution. Starting with Python 2.3 it's in the standard
|
||||
library (the Lib directory). You can copy it to your local bin directory
|
||||
and set the execute permission, then execute it directly. It's easy to run
|
||||
from the command line to trace execution of whole scripts:</p>
|
||||
|
||||
<pre>% trace.py -t spam.py eggs
|
||||
</pre>
|
||||
|
||||
<p>There's no separate documentation, but you can execute "pydoc trace" to
|
||||
view the inline documentation.</p>
|
||||
|
||||
<h2><a name="sorting">Sorting</a></h2>
|
||||
|
||||
<p> From <a href="mailto:guido@python.org">Guido van Rossum
|
||||
<guido@python.org></a></p>
|
||||
|
||||
<p>
|
||||
Sorting lists of basic Python objects is generally pretty efficient. The
|
||||
sort method for lists takes an optional comparison function as an argument
|
||||
that can be used to change the sorting behavior. This is quite convenient,
|
||||
though it can really slow down your sorts.</p>
|
||||
|
||||
<p> An alternative way to speed up sorts is to construct a list of tuples
|
||||
whose first element is a sort key that will sort properly using the default
|
||||
comparison, and whose second element is the original list element. This is
|
||||
the so-called <a href="http://www.google.com/search?q=Schwartzian+Transform&ie=UTF-8&oe=UTF-8">Schwartzian
|
||||
Transform</a>.</p>
|
||||
|
||||
<p>
|
||||
Suppose, for example, you have a list of tuples that you want to sort by the
|
||||
n-th field of each tuple. The following function will do that.</p>
|
||||
|
||||
<pre>def sortby(somelist, n):
|
||||
nlist = [(x[n], x) for x in somelist]
|
||||
nlist.sort()
|
||||
return [val for (key, val) in nlist]
|
||||
</pre>
|
||||
|
||||
<p>Matching the behavior of the current list sort method (sorting in place)
|
||||
is easily achieved as well:</p>
|
||||
|
||||
<pre>def sortby_inplace(somelist, n):
|
||||
somelist[:] = [(x[n], x) for x in somelist]
|
||||
somelist.sort()
|
||||
somelist[:] = [val for (key, val) in somelist]
|
||||
return
|
||||
</pre>
|
||||
|
||||
<p>Here's an example use:</p>
|
||||
|
||||
<pre>>>> somelist = [(1, 2, 'def'), (2, -4, 'ghi'), (3, 6, 'abc')]
|
||||
>>> somelist.sort()
|
||||
>>> somelist
|
||||
[(1, 2, 'def'), (2, -4, 'ghi'), (3, 6, 'abc')]
|
||||
>>> nlist = sortby(somelist, 2)
|
||||
>>> sortby_inplace(somelist, 2)
|
||||
>>> nlist == somelist
|
||||
True
|
||||
>>> nlist = sortby(somelist, 1)
|
||||
>>> sortby_inplace(somelist, 1)
|
||||
>>> nlist == somelist
|
||||
True
|
||||
</pre>
|
||||
|
||||
<h2><a name="stringcat">String Concatenation</a></h2>
|
||||
|
||||
<p> Strings in Python are immutable. This fact frequently sneaks up and
|
||||
bites novice Python programmers on the rump. Immutability confers some
|
||||
advantages and disadvantages. In the plus column, strings can be used a
|
||||
keys in dictionaries and individual copies can be shared among multiple
|
||||
variable bindings. (Python automatically shares one- and two-character
|
||||
strings.) In the minus column, you can't say something like, "change all
|
||||
the 'a's to 'b's" in any given string. Instead, you have to create a new
|
||||
string with the desired properties. This continual copying can lead to
|
||||
significant inefficiencies in Python programs.</p>
|
||||
|
||||
<p>Avoid this:</p>
|
||||
|
||||
<pre>s = ""
|
||||
for substring in list:
|
||||
s += substring
|
||||
</pre>
|
||||
|
||||
<p> Use <code>s = "".join(list)</code> instead. The former is a very common
|
||||
and catastrophic mistake when building large strings. Similarly, if you are
|
||||
generating bits of a string sequentially instead of:</p>
|
||||
|
||||
<pre>s = ""
|
||||
for x list:
|
||||
s += some_function(x)
|
||||
</pre>
|
||||
|
||||
<p>use</p>
|
||||
|
||||
<pre>slist = [some_function(elt) for elt in somelist]
|
||||
s = "".join(slist)
|
||||
</pre>
|
||||
|
||||
<p>Avoid:</p>
|
||||
|
||||
<pre>out = "<html>" + head + prologue + query + tail + "</html>"
|
||||
</pre>
|
||||
|
||||
<p>Instead, use</p>
|
||||
|
||||
<pre>out = "<html>%s%s%s%s</html>" % (head, prologue, query, tail)
|
||||
</pre>
|
||||
|
||||
<p>Even better, for readability (this has nothing to do with efficiency
|
||||
other than yours as a programmer), use dictionary substitution:</p>
|
||||
|
||||
<pre>out = "<html>%(head)s%(prologue)s%(query)s%(tail)s</html>" % locals()
|
||||
</pre>
|
||||
|
||||
<p> This last two are going to be much faster, especially when piled up over
|
||||
many CGI script executions, and easier to modify to boot. In addition, the
|
||||
slow way of doing things got slower in Python 2.0 with the addition of rich
|
||||
comparisons to the language. It now takes the Python virtual machine a lot
|
||||
longer to figure out how to concatenate two strings. (Don't forget that
|
||||
Python does all method lookup at runtime.)</p>
|
||||
|
||||
<h2><a name="loops">Loops</a></h2>
|
||||
|
||||
<p> Python supports a couple of looping constructs. The <code>for</code>
|
||||
statement is most commonly used. It loops over the elements of a sequence,
|
||||
assigning each to the loop variable. If the body of your loop is simple,
|
||||
the interpreter overhead of the <code>for</code> loop itself can be a
|
||||
substantial amount of the overhead. This is where the <code><a href="http://www.python.org/doc/lib/built-in-funcs.html">map</a></code>
|
||||
function is handy. You can think of <code>map</code> as a <code>for</code>
|
||||
moved into C code. The only restriction is that the "loop body" of
|
||||
<code>map</code> must be a function call.</p>
|
||||
|
||||
<p> Here's a straightforward example. Instead of looping over a list of
|
||||
words and converting them to upper case:</p>
|
||||
|
||||
<pre>newlist = []
|
||||
for word in oldlist:
|
||||
newlist.append(word.upper())
|
||||
</pre>
|
||||
|
||||
<p>you can use <code>map</code> to push the loop from the interpreter into
|
||||
compiled C code:</p>
|
||||
|
||||
<pre>newlist = map(str.upper, oldlist)
|
||||
</pre>
|
||||
|
||||
<p> List comprehensions were added to Python in version 2.0 as well. They
|
||||
provide a syntactically more compact way of writing the above for loop:</p>
|
||||
|
||||
<pre>newlist = [s.upper() for s in list]
|
||||
</pre>
|
||||
|
||||
<p>It's generally not any faster than the for loop version, however.</p>
|
||||
|
||||
<p> Guido van Rossum wrote a much more detailed <a href="http://www.python.org/doc/essays/list2str.html">examination of loop
|
||||
optimization</a> that is definitely worth reading.</p>
|
||||
|
||||
<h2><a name="dots">Avoiding dots...</a></h2>
|
||||
|
||||
<p>Suppose you can't use <code>map</code> or a list comprehension? You may
|
||||
be stuck with the for loop. The for loop example has another inefficiency.
|
||||
Both <code>newlist.append</code> and <code>word.upper</code> are function
|
||||
references that are reevaluated each time through the loop. The original
|
||||
loop can be replaced with:</p>
|
||||
|
||||
<pre>upper = str.upper
|
||||
newlist = []
|
||||
append = newlist.append
|
||||
for word in list:
|
||||
append(upper(word))
|
||||
</pre>
|
||||
|
||||
<p>This technique should be used with caution. It gets more difficult to
|
||||
maintain if the loop is large. Unless you are intimately familiar with that
|
||||
piece of code you will find yourself scanning up to check the definitions of
|
||||
<code>append</code> and <code>upper</code>.</p>
|
||||
|
||||
<h2><a name="local">Local Variables</a></h2>
|
||||
|
||||
<p> The final speedup available to us for the non-<code>map</code> version
|
||||
of the <code>for</code> loop is to use local variables wherever possible.
|
||||
If the above loop is cast as a function, append<code></code> and
|
||||
<code>upper</code> become local variables. Python accesses local variables
|
||||
much more efficiently than global variables.</p>
|
||||
|
||||
<pre>def func():
|
||||
upper = str.upper
|
||||
newlist = []
|
||||
append = newlist.append
|
||||
for word in words:
|
||||
append(upper(word))
|
||||
return newlist
|
||||
</pre>
|
||||
|
||||
<p>At the time I originally wrote this I was using a 100MHz Pentium running
|
||||
BSDI. I got the following times for converting the list of words in
|
||||
<code>/usr/share/dict/words</code> (38,470 words at that time) to upper
|
||||
case:</p>
|
||||
|
||||
<table>
|
||||
<tbody><tr>
|
||||
<th align="left">Version</th>
|
||||
<th>Time (seconds)</th>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
<td>Basic loop</td><td align="right">3.47</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
<td>Eliminate dots</td><td align="right">2.45</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
<td>Local variable & no dots</td><td align="right">1.79</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
<td>Using <code>map</code> function</td><td align="right">0.54</td>
|
||||
</tr>
|
||||
</tbody></table>
|
||||
|
||||
<p> Eliminating the loop overhead by using <code>map</code> is often going
|
||||
to be the most efficient option. When the complexity of your loop precludes
|
||||
its use other techniques are available to speed up your loops, however.</p>
|
||||
|
||||
<h2><a name="initdict">Initializing Dictionary Elements</a></h2>
|
||||
|
||||
<p> Suppose you are building a dictionary of word frequencies and you've
|
||||
already broken your text up into a list of words. You might execute
|
||||
something like:</p>
|
||||
|
||||
<pre>wdict = {}
|
||||
has_key = wdict.has_key
|
||||
for word in words:
|
||||
if not has_key(word): wdict[word] = 0
|
||||
wdict[word] = wdict[word] + 1
|
||||
</pre>
|
||||
|
||||
<p> Except for the first time, each time a word is seen the <code>if</code>
|
||||
statement's test fails. If you are counting a large number of words, many
|
||||
will probably occur multiple times. In a situation where the initialization
|
||||
of a value is only going to occur once and the augmentation of that value
|
||||
will occur many times it is cheaper to use a <code>try</code> statement:</p>
|
||||
|
||||
<pre>wdict = {}
|
||||
for word in words:
|
||||
try:
|
||||
wdict[word] += 1
|
||||
except KeyError:
|
||||
wdict[word] = 1
|
||||
</pre>
|
||||
|
||||
<p> It's important to catch the expected KeyError exception, and not have a
|
||||
default <code>except</code> clause to avoid trying to recover from an
|
||||
exception you really can't handle by the statement(s) in the
|
||||
<code>try</code> clause.</p>
|
||||
|
||||
<p> A third alternative became available with the release of Python 2.x.
|
||||
Dictionaries now have a get() method which will return a default value if
|
||||
the desired key isn't found in the dictionary. This simplifies the loop:</p>
|
||||
|
||||
<pre>wdict = {}
|
||||
for word in words:
|
||||
wdict[word] = wdict.get(word, 0) + 1
|
||||
</pre>
|
||||
|
||||
<p> When I originally wrote this section, there were clear situations where
|
||||
one of the first two approaches was faster. It seems that all three
|
||||
approaches now exhibit similar performance (within about 10% of each other),
|
||||
more or less independent of the properties of the list of words.</p>
|
||||
|
||||
<h2><a name="import">Import Statement Overhead</a></h2>
|
||||
|
||||
<p> <code>import</code> statements can be executed just about anywhere.
|
||||
It's often useful to place them inside functions to restrict their
|
||||
visibility and/or reduce initial startup time. Although Python's
|
||||
interpreter is optimized to not import the same module multiple times,
|
||||
repeatedly executing an import statement can seriously affect performance in
|
||||
some circumstances.</p>
|
||||
|
||||
<p> Consider the following two snippets of code (originally from Greg
|
||||
McFarlane, I believe - I found it unattributed in a <a href="news:comp.lang.python">comp.lang.python</a>/<a href="mailto:python-list@python.org">python-list@python.org</a> posting and
|
||||
later attributed to him in another source):</p>
|
||||
|
||||
<pre>def doit1():
|
||||
import string ###### import statement inside function
|
||||
string.lower('Python')
|
||||
|
||||
for num in range(100000):
|
||||
doit1()
|
||||
</pre>
|
||||
|
||||
<p>or:</p>
|
||||
|
||||
<pre>import string ###### import statement outside function
|
||||
def doit2():
|
||||
string.lower('Python')
|
||||
|
||||
for num in range(100000):
|
||||
doit2()
|
||||
</pre>
|
||||
|
||||
<p><code>doit2</code> will run much faster than <code>doit1</code>, even
|
||||
though the reference to the string module is global in <code>doit2</code>.
|
||||
Here's a Python interpreter session run using Python 2.3 and the new
|
||||
<code>timeit</code> module, which shows how much faster the second is than
|
||||
the first:</p>
|
||||
|
||||
<pre>>>> def doit1():
|
||||
... import string
|
||||
... string.lower('Python')
|
||||
...
|
||||
>>> import string
|
||||
>>> def doit2():
|
||||
... string.lower('Python')
|
||||
...
|
||||
>>> import timeit
|
||||
>>> t = timeit.Timer(setup='from __main__ import doit1', stmt='doit1()')
|
||||
>>> t.timeit()
|
||||
11.479144930839539
|
||||
>>> t = timeit.Timer(setup='from __main__ import doit2', stmt='doit2()')
|
||||
>>> t.timeit()
|
||||
4.6661689281463623
|
||||
</pre>
|
||||
|
||||
<p>String methods were introduced to the language in Python 2.0. These
|
||||
provide a version that avoids the import completely and runs even
|
||||
faster:</p>
|
||||
|
||||
<pre>def doit3():
|
||||
'Python'.lower()
|
||||
|
||||
for num in range(100000):
|
||||
doit3()
|
||||
</pre>
|
||||
|
||||
<p>Here's the proof from <code>timeit</code>:</p>
|
||||
|
||||
<pre>>>> def doit3():
|
||||
... 'Python'.lower()
|
||||
...
|
||||
>>> t = timeit.Timer(setup='from __main__ import doit3', stmt='doit3()')
|
||||
>>> t.timeit()
|
||||
2.5606080293655396
|
||||
</pre>
|
||||
|
||||
<p> The above example is obviously a bit contrived, but the general
|
||||
principle holds.</p>
|
||||
|
||||
<h2><a name="setgetdel">Using <code>map</code> with Dictionaries</a></h2>
|
||||
|
||||
<p> I found it frustrating that to use <code><a href="http://www.python.org/doc/current/lib/built-in-funcs.html#l2h-47">map</a></code>
|
||||
to eliminate simple <code>for</code> loops like:</p>
|
||||
|
||||
<pre>dict = {}
|
||||
nil = []
|
||||
for s in list:
|
||||
dict[s] = nil
|
||||
</pre>
|
||||
|
||||
<p>I had to use a <code><a href="http://www.python.org/doc/current/ref/lambda.html">lambda</a></code>
|
||||
form or define a <a href="http://www.python.org/doc/current/tut/node6.html">named function</a>
|
||||
that would probably negate any speedup I was getting by using
|
||||
<code>map</code> in the first place. I decided I needed some functions to
|
||||
allow me to set, get or delete dictionary keys and values en masse. I
|
||||
proposed a change to Python's dictionary object and used it for awhile.
|
||||
However, a more general solution appears in the form of the
|
||||
<code>operator</code> module in Python 1.4. Suppose you have a list and you
|
||||
want to eliminate its duplicates (ignoring the presence of set objects new
|
||||
in Python 2.3). Instead of the code above, you can execute:</p>
|
||||
|
||||
<pre>dict = {}
|
||||
map(operator.setitem, [dict]*len(list), list, [])
|
||||
list = statedict.keys()
|
||||
</pre>
|
||||
|
||||
This moves the for loop into C where it executes much faster.
|
||||
|
||||
<h2><a name="aggregate">Data Aggregation</a></h2>
|
||||
|
||||
<p> Function call overhead in Python is relatively high, especially compared
|
||||
with the execution speed of a builtin function. This strongly suggests that
|
||||
where appropriate, functions should handle data aggregates. Here's a
|
||||
contrived example written in Python.</p>
|
||||
|
||||
<pre>import time
|
||||
x = 0
|
||||
def doit1(i):
|
||||
global x
|
||||
x = x + i
|
||||
|
||||
list = range(100000)
|
||||
t = time.time()
|
||||
for i in list:
|
||||
doit1(i)
|
||||
|
||||
print "%.3f" % (time.time()-t)
|
||||
</pre>
|
||||
|
||||
<p>vs.</p>
|
||||
|
||||
<pre>import time
|
||||
x = 0
|
||||
def doit2(list):
|
||||
global x
|
||||
for i in list:
|
||||
x = x + i
|
||||
|
||||
list = range(100000)
|
||||
t = time.time()
|
||||
doit2(list)
|
||||
print "%.3f" % (time.time()-t)
|
||||
</pre>
|
||||
|
||||
<p>Here's the proof in the pudding using an interactive session:</p>
|
||||
|
||||
<pre>>>> t = time.time()
|
||||
>>> doit2(list)
|
||||
>>> print "%.3f" % (time.time()-t)
|
||||
0.204
|
||||
>>> t = time.time()
|
||||
>>> for i in list:
|
||||
... doit1(i)
|
||||
...
|
||||
>>> print "%.3f" % (time.time()-t)
|
||||
0.758
|
||||
</pre>
|
||||
|
||||
<p>Even written in Python, the second example runs about four times faster
|
||||
than the first. Had <code>doit</code> been written in C the difference
|
||||
would likely have been even greater (exchanging a Python <code>for</code>
|
||||
loop for a C <code>for</code> loop as well as removing most of the function
|
||||
calls).</p>
|
||||
|
||||
<h2><a name="periodic">Doing Stuff Less Often</a></h2>
|
||||
|
||||
<p> The Python interpreter performs some periodic checks. In particular, it
|
||||
decides whether or not to let another thread run and whether or not to run a
|
||||
pending call (typically a call established by a signal handler). Most of the
|
||||
time there's nothing to do, so performing these checks each pass around the
|
||||
interpreter loop can slow things down. There is a function in the
|
||||
<code>sys</code> module, <code>setcheckinterval</code>, which you can call
|
||||
to tell the interpreter how often to perform these periodic checks. Prior
|
||||
to the release of Python 2.3 it defaulted to 10. In 2.3 this was raised to
|
||||
100. If you aren't running with threads and you don't expect to be catching
|
||||
many signals, setting this to a larger value can improve the interpreter's
|
||||
performance, sometimes substantially.</p>
|
||||
|
||||
<h2><a name="notc">Python is not C</a></h2>
|
||||
|
||||
<p>It is also not Perl, Java, C++ or Haskell. Be careful when transferring
|
||||
your knowledge of how other languages perform to Python. A simple example
|
||||
serves to demonstrate:</p>
|
||||
|
||||
<pre> % timeit.py -s 'x = 47' 'x * 2'
|
||||
1000000 loops, best of 3: 0.574 usec per loop
|
||||
% timeit.py -s 'x = 47' 'x << 1'
|
||||
1000000 loops, best of 3: 0.524 usec per loop
|
||||
% timeit.py -s 'x = 47' 'x + x'
|
||||
1000000 loops, best of 3: 0.382 usec per loop
|
||||
</pre>
|
||||
|
||||
<p>Now consider the similar C programs (only the add version is shown):</p>
|
||||
|
||||
<pre>#include <stdio.h>
|
||||
|
||||
int
|
||||
main (int argc, char **argv) {
|
||||
int i = 47;
|
||||
int loop;
|
||||
for (loop=0; loop<500000000; loop++)
|
||||
i + i;
|
||||
}
|
||||
</pre>
|
||||
|
||||
<p>and the execution times:</p>
|
||||
|
||||
<pre> % for prog in mult add shift ; do
|
||||
< for i in 1 2 3 ; do
|
||||
< echo -n "$prog: "
|
||||
< /usr/bin/time ./$prog
|
||||
< done
|
||||
< echo
|
||||
< done
|
||||
mult: 6.12 real 5.64 user 0.01 sys
|
||||
mult: 6.08 real 5.50 user 0.04 sys
|
||||
mult: 6.10 real 5.45 user 0.03 sys
|
||||
|
||||
add: 6.07 real 5.54 user 0.00 sys
|
||||
add: 6.08 real 5.60 user 0.00 sys
|
||||
add: 6.07 real 5.58 user 0.01 sys
|
||||
|
||||
shift: 6.09 real 5.55 user 0.01 sys
|
||||
shift: 6.10 real 5.62 user 0.01 sys
|
||||
shift: 6.06 real 5.50 user 0.01 sys
|
||||
</pre>
|
||||
|
||||
<p>Note that there is a significant advantage in Python to adding a number
|
||||
to itself instead of multiplying it by two or shifting it left by one bit.
|
||||
In C on all modern computer architectures, each of the three arithmetic
|
||||
operations are translated into a single machine instruction which executes
|
||||
in one cycle, so it doesn't really matter which one you choose.</p>
|
||||
|
||||
<p>A common "test" new Python programmers often perform is to translate the
|
||||
common Perl idiom</p>
|
||||
|
||||
<pre> while (<>) {
|
||||
print;
|
||||
}
|
||||
</pre>
|
||||
|
||||
<p>into Python code that looks something like</p>
|
||||
|
||||
<pre> #!/usr/bin/env python
|
||||
|
||||
import fileinput
|
||||
|
||||
for line in fileinput.input():
|
||||
print line,
|
||||
</pre>
|
||||
|
||||
<p>and use it to conclude that Python must be much slower than Perl. As others
|
||||
have pointed out numerous times, Python is slower than Perl for some things
|
||||
and faster for others. Relative performance also often depends on your
|
||||
experience with the two languages.</p>
|
||||
|
||||
<hr>
|
||||
<address>
|
||||
Skip Montanaro<br>
|
||||
(<a href="mailto:skip@pobox.com">skip@pobox.com</a>)
|
||||
</address>
|
||||
|
||||
<p>
|
||||
<a href="http://validator.w3.org/check/referer"><img src="http://www.w3.org/Icons/valid-xhtml10" alt="Valid XHTML 1.0!" width="88" height="31"></a>
|
||||
</p>
|
||||
|
||||
<p>
|
||||
<!-- hhmts start -->
|
||||
Last modified: Fri Mar 26 09:08:18 CST 2004
|
||||
<!-- hhmts end -->
|
||||
</p>
|
||||
|
||||
</body></html>
|
||||
44
src/Python/Scripts/python-fud.py
Normal file
44
src/Python/Scripts/python-fud.py
Normal file
@@ -0,0 +1,44 @@
|
||||
#!/usr/bin/python
|
||||
import sys, getopt, ntpath, os
|
||||
|
||||
def xor(source, output, key):
|
||||
contents = bytearray(open(source, "rb").read())
|
||||
for i in range(len(contents)):
|
||||
for j in range(len(key)):
|
||||
contents[i] ^= ord(key[j])
|
||||
|
||||
with open(output, 'wb') as output:
|
||||
output.write(contents)
|
||||
|
||||
def main(filename, argv):
|
||||
inputfile = ''
|
||||
outputfile = ''
|
||||
key = ''
|
||||
|
||||
try:
|
||||
opts, args = getopt.getopt(argv, "hs:o:k:", ["source=", "output=", "key="])
|
||||
except getopt.GetoptError:
|
||||
print(filename + ' -s <source> -o <output> -k <key>')
|
||||
sys.exit(1)
|
||||
|
||||
for opt, arg in opts:
|
||||
if opt == '-h':
|
||||
print(filename + ' -s <source> -o <output> -k <key>')
|
||||
sys.exit()
|
||||
elif opt in ("-s", "--source"):
|
||||
if not os.path.exists(arg):
|
||||
print('The source file does not exist')
|
||||
sys.exit(1)
|
||||
inputfile = arg
|
||||
elif opt in ("-o", "--output"):
|
||||
outputfile = arg
|
||||
elif opt in ("-k", "--key"):
|
||||
key = arg
|
||||
|
||||
if not inputfile or not outputfile or not key:
|
||||
print(filename + ' -s <source> -o <output> -k <key>')
|
||||
sys.exit(1)
|
||||
|
||||
xor(inputfile, outputfile, key)
|
||||
|
||||
if __name__ == "__main__": main(ntpath.basename(sys.argv[0]), sys.argv[1:])
|
||||
71
src/Python/TXTS/Python Cheatsheet of Useful Commands.txt
Executable file
71
src/Python/TXTS/Python Cheatsheet of Useful Commands.txt
Executable file
@@ -0,0 +1,71 @@
|
||||
0.) For creating functions do:
|
||||
def fun():
|
||||
def main():
|
||||
def start(): etc...
|
||||
to pass variables to another function do:
|
||||
def part1(x):
|
||||
print "Hi %d" %x <-- number/digit
|
||||
NOTE: A String can not use %d
|
||||
unless it's a number: 1, 2, etc!
|
||||
print "Hi %s" %x <-- String
|
||||
def main():
|
||||
x=3
|
||||
part1(x) <-- calls part1 function with x being passed to it.
|
||||
NOTE: You can pass many values through.
|
||||
def part1(x,y,z):
|
||||
print "Hi %d %d %d" %(x,y,z) <-- numbers/digits
|
||||
NOTE: Strings can not use %d
|
||||
unless it's a number: 1, 2, etc!
|
||||
print "Hi %s %s %s" %(x,y,z) <-- Strings
|
||||
def main():
|
||||
x=1
|
||||
y=2
|
||||
z=3
|
||||
part1(x,y,z)
|
||||
1.) For intager input do:
|
||||
input("Type your number here fool!!")
|
||||
to tack it on to a variable do:
|
||||
x= input("Type your number here fool!!")
|
||||
2.) For string input do:
|
||||
raw_input("Type your name here fool!!")
|
||||
to tack it on to a variable do:
|
||||
x= raw_input("Type your name here fool!!")
|
||||
3.) To print out in a terminal do:
|
||||
print "Your message in these parenthasies"
|
||||
4.) If statements are as is:
|
||||
if x == 1:
|
||||
print "Blah"
|
||||
OR
|
||||
if x == 1:
|
||||
print "Blah"
|
||||
elif x== 2:
|
||||
print "Blah2"
|
||||
else
|
||||
print "Damn you dumb!!"
|
||||
5.) To do and or or statement do:
|
||||
x=1
|
||||
y=2
|
||||
z=3
|
||||
if x == y and x == z :
|
||||
print "Huh?"
|
||||
OR for or do:
|
||||
x=1
|
||||
y=2
|
||||
z=3
|
||||
if x == y or x == z :
|
||||
print "Huh?"
|
||||
6.) For string comparisons do :
|
||||
x="Fucker"
|
||||
z="Hi"
|
||||
if x != y :
|
||||
print "Yup, correct."
|
||||
7.) To pass variables into print do:
|
||||
x="Fucker"
|
||||
z="Hi"
|
||||
varb=(x,y)
|
||||
if x != y :
|
||||
print "Yup, correct. %S is not the same as %S" % varb
|
||||
elif x == y :
|
||||
print "Yup, correct. %S is the same as %S" % varb
|
||||
NOTE: This passes the variables in order as they appear. The % before varb
|
||||
means pass these two variables, in order of apperance to the %s's
|
||||
Reference in New Issue
Block a user