A few days ago I had the idea that it would be really nice to have a database of Greek theses and dissertations about mathematics and science in general. As far as I know, there are a few such databases around, mostly for Electrical Engineering and Computer Science dissertations. So I thought it would be a good opportunity for me to exercise my CGI and Python web scripting skills.
And here I am, presenting the Thesis Database project. I hope that it will be useful and people will contribute.

While coding the CGI backbone I ran into a few problems, and since I had to come up with the solutions myself (I couldn't find anything useful online), I decided to post them here for future reference. I will also give the basic idea of how a Python CGI upload script works. The focus is on the security of the code.
The following is a very simple HTML form, which will be used as the user interface for the upload.
We suppose that the CGI script is called upload.py and is placed inside the $Web root$/cgi-bin/ directory.
upload.html
<html>
<head> <title>Upload Example</title> </head>
<body> <div align="center">
<form action="/cgi-bin/upload.py" method="POST" enctype="multipart/form-data">
File : <input name="file" type="file" size="35"><BR>
<P><input name="submit" type="submit" value="Upload"></P>
</form>
</div> </body> </html>
upload.py
import cgi, sys, os

print('Content-Type: text/html\n')  # CGI header plus a blank line; must come before any other output

try:
    form = cgi.FieldStorage()  # parse the request
    fileitem = form['file']
    # We save the file in /tmp (set the right permissions on the folder).
    # os.path.basename drops any directory part, to avoid directory traversal attacks.
    filename = os.path.join('/tmp', os.path.basename(fileitem.filename))
    if os.path.exists(filename):  # Never overwrite an existing file. Important in order to secure our database.
        print('Filename already exists')
        sys.exit()
    fout = open(filename, 'wb')
    # Write the file in parts of 100 KB, so that larger files don't use up all the memory.
    for i in range(103):  # Upload size limit: 103 parts is about 10 MB.
        part = fileitem.file.read(100000)
        if not part:
            break
        fout.write(part)
    if fileitem.file.read(100000):
        # The file is larger than the limit: remove the part already written.
        print('The file you tried to upload is too large.')
        fout.close()
        os.remove(filename)
        sys.exit()
    fout.close()
    print('You have successfully uploaded the file.')
except Exception:  # a bare except would also catch the SystemExit raised by sys.exit()
    print('A problem occurred. Please try again.')
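The two checks in upload.py (the existing file and the size limit) are easier to try out when they live in a function of their own. The following is only a sketch, assuming Python 3. The name save_limited and its return values are made up for this post and are not part of any library. It swaps in two small variations: open mode 'xb', which refuses to open a file that already exists, so the check and the creation happen in one step, and a byte counter in place of the fixed number of loop iterations.

```python
import io
import os
import tempfile

CHUNK_SIZE = 100000            # read 100 KB at a time, as in upload.py
MAX_BYTES = 103 * CHUNK_SIZE   # about 10 MB, the same limit as upload.py


def save_limited(src, dest_path, max_bytes=MAX_BYTES, chunk_size=CHUNK_SIZE):
    """Copy the file-like object src to dest_path (hypothetical helper).

    Returns 'ok', 'exists' or 'too_large'.
    """
    try:
        # 'xb' creates the file and fails if it is already there.
        fout = open(dest_path, 'xb')
    except FileExistsError:
        return 'exists'
    written = 0
    with fout:
        while True:
            part = src.read(chunk_size)
            if not part:
                break
            written += len(part)
            if written > max_bytes:
                break  # over the limit: stop before writing this part
            fout.write(part)
    if written > max_bytes:
        os.remove(dest_path)  # drop the partial upload
        return 'too_large'
    return 'ok'


# Usage: simulate two uploads of the same name with in-memory streams.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, 'thesis.pdf')
print(save_limited(io.BytesIO(b'%PDF-1.4 demo'), demo_path))  # ok
print(save_limited(io.BytesIO(b'other data'), demo_path))     # exists
```

In the CGI script, src would be fileitem.file and dest_path the sanitized filename, and the script would pick the message to print from the returned string.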
Comments:
1) The important (non-trivial) parts of this code are the file size checking and handling, and the check for the existence of the file before writing or deleting it.
2) Wherever you see print above, the string should include HTML code, which I left out to keep the code presented here compact. If you don't do this, the chances that your program will work with an older browser are much lower.
3) It is a good idea to use a general output HTML template, which is filled in each time you print something.
4) While coding, it is very useful to use the following:
import cgitb
cgitb.enable()
This will print the error output in your browser, so you will know what is wrong with your program. You should not include it in the final version, though, since it can give away many details of your code.
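For comments 2 and 3, a general output template could look roughly like the sketch below. It assumes Python 3 and uses string.Template and html.escape from the standard library. The names PAGE and render_page are hypothetical, and the page layout is just a placeholder to be replaced with your own.

```python
import html
from string import Template

# One page layout shared by every message the script prints.
PAGE = Template("""<html>
<head><title>$title</title></head>
<body><p>$message</p></body>
</html>""")


def render_page(title, message):
    """Return a complete CGI response: header, blank line, HTML page."""
    body = PAGE.substitute(
        title=html.escape(title),      # escape so that text is never read as markup
        message=html.escape(message),
    )
    return 'Content-Type: text/html\n\n' + body


print(render_page('Upload Example', 'You have successfully uploaded the file.'))
```

Escaping matters mostly when the message contains something the user sent, such as the name of the uploaded file.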