Monday, November 12, 2012

Print all the permutations of the characters in a word

A interesting computer programming problem is to print the all permutations of a the characters in a word.

How can we do that? Below is a simple approach:

Suppose your word has 'n' chars,
then first letter of your word can have n options,
second n -1 options
third n -2 options
and so on....
last will have just one option.
Trick is to add the options successively to the prefix, and sending
the suffix (i.e.) the options for remaining positions to function
and call the function for printing permutations recursively.


Below is the code for printing permutations of a word. Here I am
not taking into account the case of repeated letters. Which results
in printing duplicates. More thoughts for avoiding printing duplicates
efficiently:). I don't want to use a set to store the words and printing them
at the end. I want an algorithmic way out to solve the issue. Also, in this example the maximum word size is taken as 26, but that can be easily changed; just replace 26 with a bigger number:)




// Below is the code for printing permutations of an word. Here I am
// not taking into account the case of repeated letters. Which results
// in printing duplicates. More thoughts for avoiding printing duplicates
// efficiently. I don't want to use a store the words and printing them
// at them at the end. I want an algorithmic way out to solve the issue
//
// How does it work ?
// Suppose your whord has 'n' chars,
// then first letter of your word can have n options,
// second n -1 options
// third n -2 options
// and so on....
// last will have just one option.
// Trick is to add the options successively to the prefix, and sending
// the suffix (i.e.) the options for remaining positions to function
// and call the function print_permutations_word repeatedly
//

#include <stdio.h>
#include <string.h>

int print_permutations_word(const char *prefix, const char *suffix, int suffixlen)
{
    char myprefix[26] = "";
    char mysuffix[26] = "";
    int i = 0, j = 0, k = 0;
    if (suffixlen == 0) {
        printf("word %s\n", prefix);
        return 0;
    }

    while (i < suffixlen) {
        memset(myprefix, 0, 26);
        memset(mysuffix, 0, 26);
        snprintf(myprefix,26,"%s%c", prefix,suffix[i]);
        j = 0;
        k = 0;
        while (j < suffixlen) {
           if (i != j){
                mysuffix[k++] = suffix[j];
           }
           j++;
        }
        i++;
        print_permutations_word(myprefix, mysuffix, suffixlen - 1);
    }
    return 0;

}

#if 0
//example run
int main()
{
print_permutations_word("", "abcde", 5);
return 0;
}
#endif





Next approach describes how we can print all permutations of the characters using iterative method based on the principle of getting the next permutation. The advantage of this approach is that we don't print duplicates and efficient for longer strings. We start with "smallest" word and successively print the next bigger word stopping at the "biggest" word. Heart of this approach is in getting the next permutation based on the current permutation.

# Given a word(current permutation), how do we find the next permutation?
# ------------------------------
# Start from last char and successively compare a char to the character in its
# left. If the left character is smaller than the current character (say the
# position of the left character is exchange position), exchange the left
# charecter with a charecter to its right which is bigger than it and
# nearest to it. When the exchange happens, sort the character array to the
# right of the exchange position.
# Repeat the above process until the iteration when no exchange of characters
# happens (i.e. it reaches the "biggest" word)

Below example describes the approach:

Current permutation “aerqpdcb”, it is represented in the array below:



a e r q p d c b


Start with rightmost char (b) and proced left to find the first character which is smaller than its right character.

So, we reached 'e' (exchange position is 1).
On the right side of e, we find p is the character which is bigger than e and nearerst to it.

Exchange, p and e and the array now becomes:


a p r q e d c b


Now, sort the array to the right of exchange position and the array becomes:


a p b c d e q r


So, next permutation after “aerqpdcb” is “apbcdeqr”


Below is the Python program (also available at github )  that
implements the approach:

# Below function demonstrates how we can print the permutations of the letters
# in a word

###############################################################################
# HOW it works?
# it starts with the "smallest" possible word and successively prints the next
# bigger word. For example, the smallest possible word from the chars in 
# word 'axprq' is 'apqrx' and then the next bigger word is 'apqxr' and so on.
# It stops after printing the "biggest" word which is 'xrqpa'.
#
# The advantage of this algorithm is that we don't print any duplicate words.
#
#
# Given a word(current permutation), how do we find the next permutation?
# ------------------------------ 
# Start from last char and successively compare a char to the character in its
# left. If the left character is smaller than the current character (say the 
# position of the left character is exchange position), exchange the left 
# charecter with a charecter to its right which is bigger than it and
# nearest to it. When the exchange happens, sort the character array to the 
# right of the exchange position.
# Repeat the above process until the iteration when no exchange of characters
# happens (i.e. it reaches the "biggest" word)

def print_permute(word):
    a = sorted(word)
    l = len(word)
    exchanged = True
    while exchanged:
        print ''.join(a)
        i = l - 1
        exchanged = False
        while i != 0:
            if a[i] <= a[i - 1]:
                i -= 1
            else:
                exchanged = True
                j = i + 1
                while j < l:
                    if a[i - 1] < a[j]:
                        j += 1 
                    else:
                        break
                j -= 1
                a[i-1], a[j] = a[j], a[i-1]
                a[i:] = sorted(a[i:])
                break

#Example Run
print ('Printing permutations of xyz')
print_permute('xyz')
print('.............................')
print('.............................')
print ('Printing permutations of abcdd')
print_permute('abcdd')
print('.............................')
print('.............................')
print ('Printing permutations of axprq')
print_permute('axprq')

Monday, November 5, 2012

Apache Thrift File based transport

Today I will demonstrate how we can use Apache Thrift FileTransport. Thrift file based transport is useful when we want to store the messages in some local files if the remote server is down. We may replay the files when the remote server comes up again. If we want to implement a Thrift based messaging system, then also this feature will be useful for "store and forwarding message" approach.

exception BadOperation {
1: i32 what,
2: optional string reason
}
enum Operation {
CREATE,
DELETE,
FILECONTENT,
DIRCONTENT
}
struct Work {
1: Operation op
2: string filename
3: string data
4: string rootdir
}
service FileService {
i32 createFile(1:Work w) throws (1:BadOperation badop),
list&lt;string> getFiles(1:Work w) throws (1:BadOperation badop)
}

Please refer my previous post or google for how to use the thrift compiler to generate code for handling thrift messages following the above definitions.My example will be in C++, but a PHP, Java or Python example would look very much similar.The generated code has a file FileService.cpp. Examining the code will show that createFileinterface calls two member functions of FileServiceClient class. They aresend_createFile and recv_createFile. send_createFile actually serialize our data,and send them over the socket in case of a typical client-server scenario.recv_createFile processes the response back from the server. In case we are using File based transport to "send" data, then there won't be any response. Hence, we don't need to call recv_createFile function. In stead we just call send_createFile to "send" or write the messages to the file. This is the trick.Same is the case with writing "getFiles" messages to the file as well.Actually we can make this much more efficient by serializing and batching the writes to the file ourselves. But here I am not showing that, as my purpose is to demonstrate the basic operations of File based transport.
Below I have pasted my code for "writing" messages to the file (file_writer.cpp)
#include <stdlib.h>
#include <time.h>  
#include <iostream>
#include <sstream> 
#include <protocol/TBinaryProtocol.h>
#include <transport/TSocket.h>       
#include <transport/TTransportUtils.h>

#include "FileService.h"

using std::cout;
using std::endl;
using std::stringstream;
using namespace apache::thrift;
using namespace apache::thrift::protocol;
using namespace apache::thrift::transport;

using namespace FileHandler;

using namespace boost;

int main(int argc, char** argv) {
    shared_ptr<TTransport> file(new TFileTransport("testfiletransport"));
    shared_ptr<TTransport> transport(new TBufferedTransport(file));      
    shared_ptr<TProtocol> protocol(new TBinaryProtocol(transport));      
    FileServiceClient client(protocol);                                  
    try {                                                                
        int i = 0;                                                       
        stringstream ss (stringstream::in | stringstream::out);          
        Work work;
        work.data = "mydata";
        work.rootdir = "/home/nipun/test";

        try {
            while (i++ < 100) {
                work.op = Operation::CREATE;
                ss << "filename" << i ;
                work.filename = ss.str();
                ss.clear();
                ss.str("");
                ss << "data" << rand() << "-" << rand() << time(0);
                work.data = ss.str();
                ss.clear();
                ss.str("");
                client.send_createFile(work);

                work.op = Operation::DIRCONTENT;
                ss << "filename" << i  << rand();
                work.filename = ss.str();
                ss.clear();
                ss.str("");
                ss << "data" << rand() << "-" << rand() << time(0) << rand();
                work.data = ss.str();
                ss.clear();
                ss.str("");
                client.send_getFiles(work);
            }
        } catch (BadOperation &op) {
            cout << "Exception "  <<  op.reason  << endl;
        }

    } catch (TException &tx) {
        cout <<  "ERROR: " << tx.what() << endl;
    }
}


Here we wrote "createFile" and "getFiles" messages 100 times each to the file "testfiletransport" in current directory. We used send_createFiles and send_getFiles routines for the same. Remember that we have to use exact similar type of transport while reading from the file also. TBinary protocol specifies how the data is serialized and TBuffered transport is used to buffer data before they are flushed to underlying transport which is the file "testfiletransport" here.

Now is the time to read the data from the file. Generally we write the data to a file and read it later when want to replay them later. In such cases the messages are read and send over another transport which may be over TSocket (socket based transport class defined in Thrift library). But in this example we will just read the messages and print them to standard out.
 

Code for file_reader.cpp

#include <stdlib.h>
#include <time.h>  
#include <iostream>
#include <sstream> 
#include <string>  
#include <protocol/TBinaryProtocol.h>
#include <transport/TSocket.h>       
#include <transport/TTransportUtils.h>

#include "FileService.h"

using std::cout;
using std::endl;
using std::stringstream;
using std::string;      
using namespace apache::thrift;
using namespace apache::thrift::protocol;
using namespace apache::thrift::transport;

using namespace FileHandler;

using namespace boost;

int main(int argc, char** argv) {
    shared_ptr<TTransport> file(new TFileTransport("testfiletransport", true));
    shared_ptr<TTransport> transport(new TBufferedTransport(file));
    shared_ptr<TProtocol> protocol(new TBinaryProtocol(transport));
    FileServiceClient client(protocol);
    string fname;
    TMessageType mtype;
    int32_t seqid;
    Work w;
    try {
        while (true) {
            protocol->readMessageBegin(fname, mtype, seqid);
            cout << fname << endl;
            if (fname == "createFile") {
                FileService_createFile_args args;
                args.read(protocol.get());
                protocol->readMessageEnd();
                w = args.w;
            } else if (fname == "getFiles") {
                FileService_getFiles_args args;
                args.read(protocol.get());
                protocol->readMessageEnd();
                w = args.w;
            }
            cout <<"\tData:\t" << w.data << endl;
            cout <<"\tFilename:\t" << w.filename << endl;
            cout <<"\tRootdir:\t" << w.rootdir << endl;
            cout <<"\tOperation:\t" << w.op << endl;

        }
    } catch (TTransportException& e) {
        if (e.getType() == TTransportException::END_OF_FILE) {
            cout << "\n\n\t\tRead All Data successfully\n";
        }
    } catch (TException &tx) {
        cout << "ERROR " <<  tx.what() << endl;
    }
}


In the file_reader.cpp example, we first read the message type using readMessageBegin interface of binary protocol.  Then we use "fname" (function name) field to decide if the message is a "createFile" or "getFiles" message. Then we use use appropriate FileService args class to read the actual message.
After that we just print the message on our terminal :)

Hope this will be useful for you and if so please don't forget to leave a comment here :)

In my next post, I will show how we may build a simple messaging system using Apache Thrift. Hopefully you will visit again.