Remove duplicates from a string
July 2, 2012
Continuous sub-array with given sum
July 4, 2012

Check if two strings are anagrams

An anagram is a word play. If letters of a word can be arranged to form another word, the two words are called anagrams. e.g MARY & ARMY are anagrams because both of them are formed of characters A, M, R and Y.

Some interesting anagram sentences are (Not considering the space character as part of the string):

graduation = out in a drag
a decimal point = I’m a dot in place
A diet = I’d eat
Admirer = Married
Debit card = Bad credit
dormitory = dirty room
mother-in-law = woman Hitler

Anyway, the list will go on and on.. you can find it here…. The question is:
Write a function which will accept two strings and return true if they are anagrams.

Solution:

Method-1: Bruite-force

    For each char in string-1
        Check if the character is present in string-2
         - If yes,
            Remove it from string-2 (or set it to some special character like '-')
         - If No,
            Return False
        if(string-2 is null)
            Return True.
        else
            Return False

Sample Run

MARY, ARMY
    Char 'M', Found and removed from string-2.
MARY, AR-Y
    Char 'A', Found and removed from string-2.
MARY, -R-Y
    Char 'R', Found and removed from string-2.
MARY, ---Y
    Char 'Y', Found and removed from string-2.
MARY, ----
    Second string is Empty (all '-'), Return True.

Note: We need to check for String-2 being empty, because there can be more char in string-2 (for whom a match is not found in string-1.

The above method is destroying string-2 in the process. If you want to preserve string-2 then you may either copy the entire string-2 in a temp string or mark the characters of string-2 which are considered as ‘visited’.

Just checking whether a char is present in string-2 or not (and not removing or marking it) will not work because the char may be repeated in string-1 and not in string-2. If we are just checking, then we will find that the char is present in string-2, but that logic is wrong.

Code:

    /** return '1' if str1 & str2 are anagrams. '0' otherwise.
     */
    int isanagram(char *str1, char* str2)
    {
        // If both are NULL. Return true
        if(str1 == NULL && str2 == NULL)
            return true;
        // If one of them is null & other not.
        if(str1 == NULL || str2 == NULL || strlen(str1) != strlen(str2))
            return false;
        for(int i=0; str1[i] != '\0'; i++)
        {
            int found = false;
            for(int j=0; str2[j] != '\0'; j++)
            {
                if(str2[j] == str1[i])
                {
                    str2[j] = '-';
                    found = true;
                    break;
                }
            }
            if(!found)
                return false;
        }
        // Chech if str2 is empty.
        for(int i=0; str2[i] != '\0'; i++)
            if(str2[i] != '-')
                return false;
        return true;
    }

Time complexity: O(n2)
Extra Space Used: Constant. O(1)

Obviously, the bruite-force method is not a good method. Lets consider, some better methods.

Method-2: Using Hashing (count characters)

Create a count array which will keep the count of number of times  a char appears in the string.

int count[256] = {0}. // Initialize all to zero
// Increment the count array for characters present in str1
for(i=0; i<strlen(str1); i++)
    count[str1[i]] ++
// Decrement the count array for characters present in str2
for(i=0; i<strlen(str2); i++)
    count[str2[i]] --
// All the values in count should be zeros
for(i=0; i<256; i++)
    if(count[i] != 0)
        return FALSE;
// All are zeros. The two strings are anagrams
return TRUE

Code:

    /** return '1' if str1 & str2 are anagrams. '0' otherwise.
     */
    int isanagram2(char *str1, char* str2)
    {
        // If both are NULL. Return true
        if(str1 == NULL && str2 == NULL)
            return true;
        // If one of them is null & other not.
        if(str1 == NULL || str2 == NULL || strlen(str1) != strlen(str2))
            return false;
        int count[256] = {0}; // Initialize all to zero
        // Increment the count array for characters present in str1
        for(int i=0; i<strlen(str1); i++)
            count[str1[i]]++;
        // Decrement the count array for characters present in str2
        for(int i=0; i<strlen(str2); i++)
            count[str2[i]]--;
        // All the values in count should be zeros
        for(int i=0; i<256; i++)
            if(count[i] != 0)
            return false;
        // All are zeros. The two strings are anagrams
        return true;
    }

Time Complexity: O(n)
Extra Space Used: O(1). But large space used to store the count array.

Method-3: Use Sorting

It is worse than Method-2.

Sort the first string
Sort the second string
Compare the two strings character-by-character.
 If not same
    return FALSE.
 else
    return TRUE.

Code:

    /** ASSUMES THE IMPLEMENTATION OF quicksort METHOD TO SORT THE STRINGS.
     * return '1' if str1 & str2 are anagrams. '0' otherwise.
     */
    int isanagram2(char *str1, char* str2)
    {
        // If both are NULL. Return true
        if(str1 == NULL && str2 == NULL)
            return true;
        // If one of them is null & other not.
        if(str1 == NULL || str2 == NULL || strlen(str1) != strlen(str2))
            return false;
        quicksort(str1);
        quicksort(str2);
        // Compare char-by-char
        for(int i=0; str1[i] != '\0' ; i++)
            if(str1[i] != str2[i])
                return false;
        return true;
    }

Time Complexity: O(n.lg(n)) .. (Time taken in sorting the strings)
Extra Space used: constant.. O(1)

Leave a Reply

Your email address will not be published. Required fields are marked *