I am new to Haskell and I am trying to implement a few known algorithms in it.
I have implemented merge sort on strings. I am a bit disappointed with the performance of my Haskell implementation compared to C and Java implementations. On my machine (Ubuntu Linux, 1.8 GHz), C (gcc 4.3.3) sorts 1 000 000 strings in 1.85 s, Java (Java SE 1.6.0_14) in 3.68 s, Haskell (GHC 6.8.2) in 25.89 s. With larger input (10 000 000 strings), C takes 21.81 s, Java takes 59.68 s, Haskell starts swapping and I preferred to stop the program after several minutes.
Since I am new to Haskell, I would be interested to know if my implementation can be made more time / space efficient.
Thank you in advance for any hint Giorgio
My implementation:
merge :: [String] -> [String] -> [String]
merge [] ys = ys
merge xs [] = xs
merge (x:xs) (y:ys) = if x < y
then x : (merge xs (y:ys))
else y : (merge (x:xs) ys)
mergeSort :: [String] -> [String]
mergeSort xs = if (l < 2)
then xs
else merge h t
where l = length xs
n = l `div` 2
s = splitAt n xs
h = mergeSort (fst s)
t = mergeSort (snd s)
Try this version:
mergesort :: [String] -> [String]
mergesort = mergesort' . map wrap
mergesort' :: [[String]] -> [String]
mergesort' [] = []
mergesort' [xs] = xs
mergesort' xss = mergesort' (merge_pairs xss)
merge_pairs :: [[String]] -> [[String]]
merge_pairs [] = []
merge_pairs [xs] = [xs]
merge_pairs (xs:ys:xss) = merge xs ys : merge_pairs xss
merge :: [String] -> [String] -> [String]
merge [] ys = ys
merge xs [] = xs
merge (x:xs) (y:ys)
= if x > y
then y : merge (x:xs) ys
else x : merge xs (y:ys)
wrap :: String -> [String]
wrap x = [x]
Edit: Someone who down-vote this answer: above merge sort implementation is same algorithm as used in ghc Data.List.sort except with cmp function removed. Well ghc authors are may be wrong :-/