
PPSS: a handy tool for parallel processing

August 27, 2012 | Category: Blog

Last Updated on by ravibigapp

We were looking for a lightweight Linux shell tool that could run given commands while taking advantage of a multicore system (parallel processing). Two of the tools we came across are worth writing about: parallel and PPSS. This blog discusses PPSS, while parallel can be food for a future blog.

PPSS can be downloaded from https://code.google.com/p/ppss/ (there are deb and rpm files too). PPSS is a shell script that can be used to run any command, script or program in parallel. All it needs is a source (a file or a directory) and a command to execute.

Example:

./ppss -d SOURCE_DIR -c 'COMMAND' -p NUMBER_OF_WORKERS

./ppss -f SOURCE_FILE -c 'COMMAND' -p NUMBER_OF_WORKERS

If the source is a directory, PPSS executes the command on each file in the directory; if the source is a file, it executes the command on each line of the file.
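The rule above (one item per file for a directory, one item per line for a file) can be sketched in plain shell. This is only an illustration of the idea, not how PPSS is implemented: the function names list_items and run_per_item are hypothetical, the directory listing is assumed to be non-recursive, and items are processed one after another rather than in parallel.

```shell
#!/bin/sh
# Hypothetical helper (not part of PPSS): print one item per line for a source.
list_items() {
    src=$1
    if [ -d "$src" ]; then
        # Directory source: every regular file directly inside it is an item.
        # Assumption: no recursion into subdirectories.
        for f in "$src"/*; do
            if [ -f "$f" ]; then
                printf '%s\n' "$f"
            fi
        done
    elif [ -f "$src" ]; then
        # File source: every line of the file is an item.
        cat "$src"
    else
        echo "list_items: no such file or directory: $src" >&2
        return 1
    fi
}

# Hypothetical helper: run a command string once per item, serially,
# exposing the current item to the command as $ITEM.
run_per_item() {
    src=$1
    cmd=$2
    list_items "$src" | while IFS= read -r ITEM; do
        export ITEM
        # </dev/null keeps the command from swallowing the remaining items.
        sh -c "$cmd" </dev/null
    done
}
```

For instance, run_per_item ./some_dir 'wc -l "$ITEM"' would count the lines of each file in ./some_dir, where ./some_dir is a placeholder path.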

Note: the command should always be enclosed in single quotes, so that the calling shell does not expand "$ITEM" before PPSS runs the command. The argument (each file in the source directory or each line in the source file) is available inside the command as the variable "$ITEM". At any time, the number of items being processed will never exceed the number of cores available. For example, while processing 50GB of data:

$ ~/bin/ppss -f list_of_files_to_be_processed.txt -c 'zgrep "" ../../"$ITEM"' -p 2

Jul 26 09:11:22:  =========================================================
Jul 26 09:11:22:                         |P|P|S|S|
Jul 26 09:11:22:  Distributed Parallel Processing Shell Script vers. 2.97
Jul 26 09:11:22:  =========================================================
Jul 26 09:11:22:  Hostname:             domU-12-31-39-00-EC-96
Jul 26 09:11:22:  ---------------------------------------------------------
Jul 26 09:11:22:  CPU: Dual-Core AMD Opteron(tm) Processor 2218 HE
Jul 26 09:11:22:  Starting 2 parallel workers.
Jul 26 09:11:22:  ---------------------------------------------------------
Jul 26 13:18:18:  70% complete. Processed 10199 of 14400. Failed 158/14400
Jul 26 13:16:33:  ETA: Thu Jul 26 13:11:23 UTC 2012
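The idea of never having more items in flight than there are workers can be imitated with xargs, which is a different tool and only a stand-in here, not what PPSS does internally. In this sketch the function name run_bounded is hypothetical, the -P option is assumed to be available (it is in GNU and BSD xargs, but not in POSIX), and the worker count is assumed to default to the number of online cores reported by getconf.

```shell
#!/bin/sh
# Hypothetical stand-in for "ppss -f LIST -c 'CMD' -p N", built on xargs.
# Usage: run_bounded LIST_FILE 'COMMAND USING $ITEM' [WORKERS]
run_bounded() {
    list=$1
    cmd=$2
    # Assumption: default to the number of online cores when WORKERS is omitted.
    workers=${3:-$(getconf _NPROCESSORS_ONLN)}
    # -P caps how many commands run at the same time.
    # -I '{}' makes each input line one item and substitutes it for {}.
    # env puts the item into the ITEM variable before sh runs the command.
    xargs -P "$workers" -I '{}' env ITEM='{}' sh -c "$cmd" < "$list"
}
```

A call such as run_bounded list_of_files_to_be_processed.txt 'wc -c "$ITEM"' 2 would keep at most two wc processes running at once. Output order is not guaranteed, and xargs still interprets quotes and backslashes in the list, so this sketch suits simple file names only.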

The output of the command for each item is logged in its own file, named after the item. The log files are in './ppss_dir/job_log/'. The execution status for an item can be read from its log file, which shows either "Status: FAILURE" or "Status: Success".
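Since each item has its own log file with a status line, failed items can be collected with grep. The sketch below is a rough illustration: the function names failed_logs and summarize_jobs are hypothetical, and the exact spelling and spacing of the status line is an assumption based on the description above, which is why the match ignores case and allows any whitespace after "Status:".

```shell
#!/bin/sh
# Hypothetical helper: print the job logs whose status line reports a failure.
# Assumption: the line looks like "Status: FAILURE" (case and spacing may vary).
failed_logs() {
    log_dir=${1:-./ppss_dir/job_log}
    # -l prints only the names of matching files; -i ignores case.
    grep -il 'Status:[[:space:]]*FAILURE' "$log_dir"/* 2>/dev/null
}

# Hypothetical helper: print a "Failed X/Y" summary for a job log directory.
summarize_jobs() {
    log_dir=${1:-./ppss_dir/job_log}
    total=$(find "$log_dir" -type f | wc -l | tr -d ' ')
    failed=$(failed_logs "$log_dir" | wc -l | tr -d ' ')
    echo "Failed $failed/$total"
}
```

Running summarize_jobs with no argument looks in ./ppss_dir/job_log, and failed_logs gives the list of log files to inspect or whose items need rerunning.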

For more information on ppss, follow the links below.

https://code.google.com/p/ppss/

https://code.google.com/p/ppss/downloads/detail?name=ppss-2.85.tgz


© Promptcloud 2009-2020 / All rights reserved.