суббота, 26 марта 2016 г.

Python groupby pitfall

Python function groupby has one feature which can be cause of unexpected pitfall:

...The returned group is itself an iterator that shares the underlying iterable with groupby(). Because the source is shared, when the groupby() object is advanced, the previous group is no longer visible. So, if that data is needed later, it should be stored as a list...

Example:

from itertools import *

input = ((1, "one"), (1, "one too"), (1, "also one"),
         (2, "two"), (2, "another two"), (2, "two again")
)

grp = groupby(input, lambda p: p[0])
# grp = map(lambda p: (p[0], list(p[1])), grp)   # (A)
maxg, maxi = max(grp, key=lambda g: g[0])
print("maxg =", maxg, "maxi =", list(maxi))

Output without (A) is:

maxg = 2 maxi = [(2, 'two again')]

which is unexpected because we suppose to find "last" group to iterate than over it's items (they are 3!).

But with (A) is:

maxg = 2 maxi = [(2, 'two'), (2, 'another two'), (2, 'two again')]

This happens because sub-iterators of groups are not independent, iteration over groups "eats" sub-iterators too.

среда, 23 марта 2016 г.

Iterative "in-place" Haskell factorial

This is the implementation of factorial which does not use recursion and does not pass result via stack but use 2 "global" vars: counter abd factorial value. For example:

fact 5 = 120

       counter  | fact
       ---------+-----------
        5       | 5
        4       | 20
        3       | 60
        2       | 120
        1       | 120 => end

Instead of pass temporary result of multiplication via stack, we'll create 2 states and use them as counter and factorial value. How to create 2 states? For one state we can use State monad, for 2... state in the state. Adding state to another monad (possible state too) is responsibility of StateT transformer. Calculation will looks like StateT Integer (State Integer) (). This means that we use Integer for counter and multiplication result, nothing will be returned as result because we use as result the state.

import Control.Monad
import Control.Monad.Trans.State
import Control.Monad.Trans.Class


-- | StateT s m a :: * -> (* -> *) -> * -> *
-- here "external" is in signature, not actually - in call
fact1 :: StateT Integer (State Integer) ()
fact1 = do
  i <- get              -- ^ counter: external state
  when (i > 1) $ do     -- ^ if counter > 1
    lift $ modify (i*)  -- ^ mult internal state by counter (initial: 1)
    modify (-1+)        -- ^ decrement external state
    fact1               -- ^ to next iteration, changed states


fact :: Integer -> Integer
fact n = execState (execStateT fact1 n) 1
-- ^ external (internal in signature) state is fact, internal is counter


main :: IO ()
main = do
  putStrLn "Result: "
  print $ fact 5

Implementation is simple. First, signature describes only types, not actual nesting structure, so in function body "internal" is the internal in signature. lift/get are relative for internal/external but in termins of signature. Really after transformation you will works with stack-like structure, so when you call your function, you should create all in reverse order: internal becomes external, so we get counter as external state but pass it in call as internal. No collision here: it's actually internal, only in signature it's looks like external, because signature looks like application of StateT (externally) to argument State.

We nothing return because there is a way to return state instead of result. This does exec* functions family (execState is needed in external call).

воскресенье, 13 марта 2016 г.

Harmonic series

This is the very simple interview question: check that series are harmonic, i.e. list of numbers are line (kx + b) - two close elements always have the same difference (k).

Naive implementation is:

-- Naive version of checking if sequence of numbers are harmonic (difference
-- between neighbords is the const)

-- Folding function with args (prev, diff, result)
rf (p, d, Nothing) c = (c, d', Just True) where d' = c - p
rf (p, d, Just False) c = (c, d, Just False)
rf (p, d, Just True) c = (c, d', Just $ d == d') where d' = c - p

harm [] = False
harm (hd:xs) = let (_, _, Just x) = foldl rf (hd, 0, Nothing) xs in x


main = do
  print $ harm [1, 2, 3, 4, 5, 6, 9]
  print $ harm [1, 2, 3, 4, 5, 6]
  print $ harm [2, 4, 6, 8]

where we are folding items and keep on each step tuple (prev, diff, result), where result is Just True, Just False or Nothing on first item. Actually this code has problem: it can not process harm [2] or harm [].

Another version may calculate difference between neighboards and then to calculate difference in the same way again:

1    2    3    4    6
  -1   -1   -1   -2
     0    0   -1!

After it we can summarize all items and if result is 0 then we have harmonic numbers:

neighOp op [] = []
neighOp op l@(hd:xs) = zipWith op l xs

harm [] = False
harm [x] = False
harm ns = 0 == (foldl1 (+) $ neighOp (-) $ (neighOp (-) ns))


main = do
  print $ harm [1, 2, 3, 4, 5, 6, 9]
  print $ harm [1, 2, 3, 4, 5, 6]
  print $ harm [2, 4, 6, 8]
  print $ harm [2]
  print $ harm []

3d version does not folding all items but drops all 0's until find first non zero which means that non-harmonic series are found:

neighOp op [] = []
neighOp op l@(hd:xs) = zipWith op l xs

harm [] = False
harm [x] = False
harm ns = case dropWhile (==0) $ neighOp (-) $ neighOp (-) ns of
  [] -> True
  _ -> False


main = do
  print $ harm [1, 2, 3, 4, 5, 6, 9]
  print $ harm [1, 2, 3, 4, 5, 6]
  print $ harm [2, 4, 6, 8]
  print $ harm [2]
  print $ harm []

Case will be lazy so calculations happens until first non-zero item in 2nd difference.

Quicksort in Haskell in imperative style

This is the Haskell implementation of quicksort but in imperative style; yes, it's terrible and very naive implementation :-)

Difference between all C/C++/Java implementations from the Web is that all indexes are checking on out of bounds.

-- exit from array bounds is checking !
import Data.Array
import qualified Data.Array.ST as DAS
import qualified Control.Monad.ST as CMS
import qualified Data.Array.MArray as DAM


findLeft :: (Ix i, Num i, Ord e) => DAS.STArray s i e -> i -> e -> CMS.ST s i
findLeft arr l p = do
  let n = l + 1
  e <- DAM.readArray arr l
  b <- DAM.getBounds arr
  if e < p && (inRange b n) then findLeft arr n p else return l


findRight :: (Ix i, Num i, Ord e) => DAS.STArray s i e -> i -> e -> CMS.ST s i
findRight arr r p = do
  let n = r - 1
  e <- DAM.readArray arr r
  b <- DAM.getBounds arr
  if e > p && (inRange b n) then findRight arr n p else return r


swap :: Ix i => DAS.STArray s i e -> i -> i -> CMS.ST s ()
swap arr i j = do
  vi <- DAM.readArray arr i
  vj <- DAM.readArray arr j
  DAM.writeArray arr i vj
  DAM.writeArray arr j vi


swapAll :: (Ix i, Num i, Ord e) => DAS.STArray s i e -> i -> i -> e -> CMS.ST s (i, i)
swapAll arr l r p = do
  l' <- findLeft arr l p
  r' <- findRight arr r p
  b <- DAM.getBounds arr
  let nl = l' + 1
      nr = r' - 1
  if l' <= r' && (inRange b nl) && (inRange b nr) then do
    swap arr l' r'
    swapAll arr (l' + 1) (r' - 1) p
  else return (l', r')


qsortArray :: (Integral i, Ix i, Ord e) => DAS.STArray s i e -> i -> i -> CMS.ST s ()
qsortArray arr i0 i1 =
  let l = i0
      r = i1
      mid = (i0 + i1) `div` 2 -- pivot index
  in do
    p <- DAM.readArray arr mid -- pivot
    (l', r') <- swapAll arr l r p
    if i0 < r' then do
      qsortArray arr i0 r'
      else return ()
    if l' < i1 then do
      qsortArray arr l' i1
      else return ()


qsort  :: (Num e, Ord e) => [e] -> [e]
qsort xs =
  let len = length xs
      i1 = len - 1
  in elems $ DAS.runSTArray $
    do
      arr <- DAM.newListArray (0, i1) xs
      qsortArray arr 0 i1
      return arr


main :: IO()
main = print $ qsort [0, 0, 9, 10, 4, 3, 98, 100, 99, 2, 6, 1]