SharedLogWiki:CodeRewrite
Initial Changes to the code
Check uploads for viruses.
About the virus scan: it may be better to use the clamd daemon or the clamscan command line tool instead of php-clamav, because I think php-clamav may be causing problems for the php/http processes.
Find the file type with a specialized check_type function, similar to check_mime but also using extensions. It returns $file_type, or false if the type is not supported.
This function will also take over the functionality of is_supported.
Possible types are:
image
archive
swf
document
video
false
Then do a switch/case on the type and pass the file to the corresponding function.
Possible functions are:
process_image (all supported image types. This is one of the bigger functions, since it checks whether the image is corrupted, resizes, adds the watermark, extracts EXIF and creates the thumb for the album)
process_archive (zip, tar, gz, bz2, rar) // will unpack archive and create an array of file names/temp paths
process_swf // record swf into resource, record into file system (originals is fine), create html for it and record html into RESOURCE.description
process_text_document (for txt, word, wpd, html, odt etc...)
process_video // too far in the future, will discuss later.
We will also add more common functions to a separate file like !class.Main_99.inc.php or !library.system.inc.php
Unused functions should also be removed from there. They are left over from other old projects and are not used in the sharedlog project.
It will include functions for checking file types, deleting temporary files and dirs recursively, and many other common functions.
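One of the common helpers mentioned above is recursive deletion of temporary files and dirs. A minimal sketch of it could look like this; the name remove_temp_path is made up for illustration, and it assumes PHP 7 or later for the type hints.

```php
<?php
// Hypothetical helper (name is an assumption): remove a temporary file
// or a whole directory tree. Returns true only if everything was removed.
function remove_temp_path(string $path): bool
{
    // Plain files and symlinks are unlinked; a symlink is never followed.
    if (is_link($path) || is_file($path)) {
        return unlink($path);
    }
    if (!is_dir($path)) {
        return false; // nothing there to delete
    }
    $entries = scandir($path);
    if ($entries === false) {
        return false;
    }
    foreach ($entries as $entry) {
        if ($entry === '.' || $entry === '..') {
            continue;
        }
        // Delete children first, then the directory itself.
        if (!remove_temp_path($path . DIRECTORY_SEPARATOR . $entry)) {
            return false;
        }
    }
    return rmdir($path);
}
```

Stopping at the first failure keeps the function simple; a real version might prefer to continue and log what could not be removed.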
Process_image function will save the original, resize to work_size, create the thumb, extract EXIF data if it's a jpeg or tiff, and convert .bmp, jp2 and tiff into jpegs. It will record title and description into RESOURCE and ALBUM using the username of the owner.
We have all these functions already available somewhere; they just need to be repackaged and put into one class or even into one function.
Process_archive will determine the type of archive. It can be zip, tar/gz/bz2 or rar. Depending on the type of archive, different functions will be used to unpack it. After unpacking we should have an array of file names, types, paths to files and possibly album_name. We will then pass each file to the appropriate processing function: for example, an image will be passed to process_image.
Temporary files are deleted at the end.
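For the "determine the type of archive" step, one option is to look at the first bytes of the file rather than trusting the extension. This is only a sketch: detect_archive_type is a made-up name, and the signatures checked are the usual ones for zip, rar, gzip, bzip2 and POSIX tar. An empty zip starts with a different signature and is not recognized here.

```php
<?php
// Hypothetical helper (name is an assumption): guess the archive type
// from the file's leading bytes. Returns a type string or false.
function detect_archive_type(string $path)
{
    if (!is_file($path) || !is_readable($path)) {
        return false;
    }
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        return false;
    }
    // 262 bytes is enough to reach the tar marker at offset 257.
    $head = (string) fread($handle, 262);
    fclose($handle);

    if (strncmp($head, "PK\x03\x04", 4) === 0) {
        return 'zip';
    }
    if (strncmp($head, "Rar!\x1A\x07", 6) === 0) {
        return 'rar';
    }
    if (strncmp($head, "\x1F\x8B", 2) === 0) {
        return 'gz';
    }
    if (strncmp($head, "BZh", 3) === 0) {
        return 'bz2';
    }
    if (substr($head, 257, 5) === 'ustar') {
        return 'tar';
    }
    return false;
}
```

A .tar.gz file will be reported as 'gz' here, so the unpack function for gz would still need to check what is inside.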
Process_swf will read the .swf file using the swfdump tool and create an HTML file that embeds the swf. The HTML will then be recorded into RESOURCE.description and the swf into /originals/ or some special folder like /swf/
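The HTML part of Process_swf could be as small as the sketch below. It assumes the width and height have already been read from the swf (for example from swfdump output, which is not parsed here); build_swf_embed_html and its parameters are made-up names, and the markup is just one common way of embedding flash.

```php
<?php
// Hypothetical helper (name and markup are assumptions): build the HTML
// that would be stored in RESOURCE.description for an uploaded swf.
function build_swf_embed_html(string $swf_url, int $width, int $height): string
{
    // Escape the URL so it is safe inside an HTML attribute.
    $url = htmlspecialchars($swf_url, ENT_QUOTES, 'UTF-8');
    return sprintf(
        '<object type="application/x-shockwave-flash" data="%s" width="%d" height="%d">'
        . '<param name="movie" value="%s" /></object>',
        $url,
        $width,
        $height,
        $url
    );
}
```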
Process_document will process all docs like MS Word, WordPerfect, OpenOffice, HTML, text, AbiWord, basically what UploadFile.php by Aysha is doing now.
Process_video will eventually deal with uploaded videos. Generally it will use some encoder tool to convert avi or mpeg into the .flv or .rv streaming format.
We will still be using some of the existing classes, like error handling, notify_developer etc...
So basically it will be mostly re-packaging of the classes with some changes whenever necessary and some new functions will be added.
Tasks by priority:
0) Check uploaded files for max_size (in bytes) and run the virus check. If any check is positive, notify the developer and the uploader (if not anonymous) and delete the temp files. If negative, pass to process_file($file_to_parse)
Where $file_to_parse will be an array of path, title, album_id, user_id
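A rough sketch of this pre-check step follows. The function names, the result array and the idea of passing the scanner in as a callable are all assumptions made for illustration; notifying and deleting temp files is left to the caller. The clamscan variant assumes the clamscan binary is on the PATH and relies on its documented exit codes (0 clean, 1 virus found, 2 error).

```php
<?php
// Hypothetical pre-check (names and result shape are assumptions):
// size limit first, then the virus scan.
// $scanner is any callable that takes a path and returns true if infected.
function precheck_upload(array $file_to_parse, int $max_size, callable $scanner): array
{
    $path = $file_to_parse['path'];
    $size = is_file($path) ? filesize($path) : false;
    if ($size === false) {
        return ['ok' => false, 'reason' => 'missing'];
    }
    if ($size > $max_size) {
        return ['ok' => false, 'reason' => 'max_size'];
    }
    if ($scanner($path)) {
        return ['ok' => false, 'reason' => 'virus'];
    }
    return ['ok' => true, 'reason' => null];
}

// One possible scanner, assuming clamscan is installed and on the PATH.
function clamscan_is_infected(string $path): bool
{
    exec('clamscan --no-summary ' . escapeshellarg($path), $output, $exit_code);
    if ($exit_code === 1) {
        return true;  // virus found
    }
    if ($exit_code === 0) {
        return false; // clean
    }
    // Anything else means the scan itself did not work.
    throw new RuntimeException('clamscan failed with exit code ' . $exit_code);
}
```

Usage would be something like precheck_upload($file_to_parse, $max_size, 'clamscan_is_infected'), and only an 'ok' result goes on to process_file(). Keeping the scanner as a parameter makes it easy to switch to clamd later.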
process_file() will include a list of functions:
find_type();
then case type = function
Each function should return an array of paths to temporary files, so they can be deleted at the end.
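The flow of process_file() might be sketched like this. It uses a type => handler lookup array instead of a literal switch, which is an assumption on my part, as are the parameter list, the result shape and the delete_temp_paths name. The handlers stand in for process_image, process_archive and so on.

```php
<?php
// Hypothetical sketch of process_file(); the parameters and the result
// array are assumptions. $find_type returns a type string or false,
// $handlers maps type => callable.
function process_file(array $file_to_parse, callable $find_type, array $handlers): array
{
    $type = $find_type($file_to_parse['path']);
    if ($type === false || !isset($handlers[$type])) {
        return ['type' => false, 'temp_paths' => []];
    }
    // Each handler returns the temp paths it created.
    $temp_paths = call_user_func($handlers[$type], $file_to_parse);
    return ['type' => $type, 'temp_paths' => $temp_paths];
}

// Delete the temp files reported by the handlers, at the very end.
function delete_temp_paths(array $temp_paths): void
{
    foreach ($temp_paths as $path) {
        if (is_file($path)) {
            unlink($path);
        }
    }
}
```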
Create function find_type
For this function we will need to create new arrays in config-cfg.inc.php
Array 1: ext => type
Array 2: type => array('mime1', 'mime2', 'mime3'); // possible mime types for one file type, e.g. rar can be application/x-rar and can also be application/octet-stream
Maybe it's better the other way: mime => array(type1, type2, type3)
Another array will have file_type => function
Example:
application/msword => parse_document,
application/wordperfect => parse_document,
image/jpeg => parse_image,
image/gif => parse_image
Maybe better to have several arrays like:
IMAGES(jpeg, gif, tiff, bmp, png);
DOCUMENTS(word, wpd, odt, html)
FLASH(swf)
VIDEOS(flv, mpeg, avi)
ARCHIVES(zip, tar, gz, bz2, rar)
If the uploaded file does not match any of the supported types, send an email to the user and an email to the admin.
These arrays could then be loaded into memcache for extra fast parsing.
1) Find the array of type(s) using the mime. Find the type by extension. Do array_intersect or array_search to find the common result (only one) for mime and extension.
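A minimal sketch of the mime/extension intersection, using the "mime => array of types" layout. The array names, the example entries and the function signature are assumptions; the real arrays would live in config-cfg.inc.php.

```php
<?php
// Assumed config arrays in the shape described above (entries are examples).
$EXT_TYPES = [
    'jpg' => 'image', 'jpeg' => 'image', 'gif' => 'image', 'png' => 'image',
    'zip' => 'archive', 'rar' => 'archive', 'swf' => 'swf', 'doc' => 'document',
];
$MIME_TYPES = [
    'image/jpeg' => ['image'],
    'image/gif' => ['image'],
    'application/msword' => ['document'],
    'application/x-rar' => ['archive'],
    // A generic mime can belong to several types; the extension decides.
    'application/octet-stream' => ['archive', 'swf', 'document'],
];

// Returns the single type that mime and extension agree on, or false.
function find_type(string $filename, string $mime, array $ext_types, array $mime_types)
{
    $ext = strtolower(pathinfo($filename, PATHINFO_EXTENSION));
    $by_ext = isset($ext_types[$ext]) ? [$ext_types[$ext]] : [];
    $by_mime = isset($mime_types[$mime]) ? $mime_types[$mime] : [];
    $common = array_values(array_intersect($by_mime, $by_ext));
    return count($common) === 1 ? $common[0] : false;
}
```

With these arrays, a file named pack.rar sent as application/octet-stream resolves to archive, while photo.jpg sent as application/msword resolves to false because the two lookups disagree.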
Create function parse_image(album_id, title, userid, type) using existing functions and classes for the is_corrupted check, image_dimensions, resizing to work size + adding watermark, creating thumb, extracting EXIF if jpeg or tiff, and converting non-web formats to web formats. Use jpegtran to resize jpegs, use MW to resize other types.
Check whether the user options have smart resize selected; if yes, shave the image edges using the MW function shave.
Function do_album_thumb creates the thumbnail of an album using the first 4 images from the album. Blur the images first when the album changes to private. Redo/change the album thumb after a change to an image in the album only if the changed image is one of the first 4, or if the album is changed from public to private or back.
Use the type of album_thumb that user has selected in options. Currently 3 possible options: folder, stack, polaroid.
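The "when to redo the album thumb" rule is simple enough to pin down in a small function. This is only a sketch: the function name, the zero-based position and the two privacy flags are assumptions about how the caller would describe the change.

```php
<?php
// Hypothetical rule check (name and parameters are assumptions).
// $image_position is the zero-based position of the changed image in
// the album, or -1 if no image was changed.
function album_thumb_needs_redo(int $image_position, bool $was_private, bool $is_private): bool
{
    // Only the first 4 images are used for the album thumb.
    $in_first_four = $image_position >= 0 && $image_position < 4;
    // Public to private or back both count.
    $privacy_flipped = $was_private !== $is_private;
    return $in_first_four || $privacy_flipped;
}
```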
parse_text_document
if file_type is in DOCUMENTS array then parse document.
All these will first be done for processing ONE uploaded file. Since we can receive up to 10 files, we will have to do a loop to pass each file to these functions. Multiple uploaded files can have the same albumID, same category and subcategory and of course the same userID.
This means that we must hold the uploaded data in array like
$UPLOAD['SESSION_ID']['userID']
$UPLOAD['SESSION_ID']['albumID']
$UPLOAD['SESSION_ID']['categoryID']
$UPLOAD['SESSION_ID']['FILES']['0']['title']
$UPLOAD['SESSION_ID']['FILES']['0']['tmp_path']
But it may be more convenient to rearrange this array like
$UPLOAD['SESSION_ID']['0']['title']
$UPLOAD['SESSION_ID']['0']['category_id']
$UPLOAD['SESSION_ID']['0']['tmp_path']
This way there will be some duplicate data in the array, because userID, categoryID and albumID are the same for all files, but at least the array will be easier to work with during the foreach loop.
We will then loop through all files and process them one by one, as if each were the only uploaded file.
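A sketch of building that flattened array, where the shared IDs are copied into every file record. The helper name build_upload_list and the exact set of keys are assumptions based on the layouts above.

```php
<?php
// Hypothetical helper (name and keys are assumptions): turn the shared
// upload data plus the list of files into one record per file.
function build_upload_list(array $shared, array $files): array
{
    $upload = [];
    foreach (array_values($files) as $i => $file) {
        // The shared IDs are duplicated into every record on purpose,
        // so each record can be processed as if it were the only file.
        $upload[$i] = [
            'userID' => $shared['userID'],
            'albumID' => $shared['albumID'],
            'category_id' => $shared['categoryID'],
            'title' => $file['title'],
            'tmp_path' => $file['tmp_path'],
        ];
    }
    return $upload;
}
```

The loop is then just foreach ($UPLOAD as $file_to_parse) { ... } with no lookups outside the record.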
We may (or may not) need to first move uploaded files away from the php file upload dir to our own temp dir, so that php will not accidentally overwrite or delete these files before we finish parsing them. This is just good common-sense practice.
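Moving an upload into our own temp dir could be sketched as below. The name stash_upload, the random target name and the optional $mover parameter are assumptions; by default it uses PHP's move_uploaded_file(), which only accepts files that really came in through an HTTP upload. It assumes PHP 7.1 or later.

```php
<?php
// Hypothetical helper (name and parameters are assumptions): move an
// uploaded file into our own temp dir. Returns the new path or false.
// $mover defaults to move_uploaded_file(); another callable with the
// same (from, to) arguments can be passed in, e.g. for testing.
function stash_upload(string $tmp_path, string $own_temp_dir, ?callable $mover = null)
{
    $mover = $mover ?? 'move_uploaded_file';
    if (!is_dir($own_temp_dir) && !mkdir($own_temp_dir, 0700, true)) {
        return false;
    }
    // Random name, so two uploads can never overwrite each other.
    $target = $own_temp_dir . DIRECTORY_SEPARATOR . bin2hex(random_bytes(8));
    return $mover($tmp_path, $target) ? $target : false;
}
```

The returned path is what would go into 'tmp_path' for the rest of the processing, and it is also the path to delete at the end.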
PS. We probably don't even need 'SESSION_ID' key at all, just $UPLOAD['0']['tmp_path'] and so on..
