haskell - What are the space complexities of inits and tails?
tl;dr

After reading the passage on persistence in Okasaki's Purely Functional Data Structures, and going over the illustrative examples about singly linked lists (which is how Haskell's lists are implemented), I was left wondering about the space complexities of Data.List's inits and tails...
It seems to me that

- the space complexity of tails is linear in the length of its argument, and
- the space complexity of inits is quadratic in the length of its argument,

but a simple benchmark indicates otherwise.
Rationale
With tails, the original list can be shared. Computing tails xs consists in walking along the list xs and creating a new pointer to each element of the list; there is no need to recreate any part of xs in memory.
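A minimal sketch of a tails-like function that makes this sharing visible (it mirrors the standard definition, though the library's exact code may differ): each suffix in the result is literally a node of the original list, so no cons cells are copied.

    -- Each suffix returned is a pointer into the original list;
    -- no cons cell of xs is rebuilt.
    tails' :: [a] -> [[a]]
    tails' xs = xs : case xs of
        []      -> []
        _ : xs' -> tails' xs'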
In contrast, because each element of inits xs "ends in a different way", there can be no such sharing, and all the possible prefixes of xs must be recreated from scratch in memory.
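To make the copying explicit, here is a naive, hypothetical inits-like function (not Data.List's actual code): every prefix is rebuilt with fresh conses by take.

    -- Hypothetical illustration: each prefix is built by take, which
    -- allocates new cons cells for every element it returns.
    inits' :: [a] -> [[a]]
    inits' xs = [ take n xs | n <- [0 .. length xs] ]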
Benchmark
The simple benchmark below shows that there isn't much of a difference in memory allocation between the two functions:
    -- Main.hs
    import Data.List (inits, tails)

    main :: IO ()
    main = do
        let intRange = [1 .. 10 ^ 4] :: [Int]
        print $ sum intRange
        print $ fInits intRange
        print $ fTails intRange

    fInits :: [Int] -> Int
    fInits = sum . map sum . inits

    fTails :: [Int] -> Int
    fTails = sum . map sum . tails
After compiling the Main.hs file with

    ghc -prof -fprof-auto -O2 -rtsopts Main.hs
and running

    ./Main +RTS -p

the Main.prof file reports the following:

    COST CENTRE MODULE  %time %alloc
    fInits      Main     60.1   64.9
    fTails      Main     39.9   35.0
The memory allocated by fInits and that allocated by fTails have the same order of magnitude... Hmm...

What is going on?
- Are my conclusions about the space complexities of tails (linear) and inits (quadratic) correct?
- If so, why does GHC allocate memory of the same order of magnitude for fInits and for fTails? Does list fusion have anything to do with this?
- Or is the benchmark flawed?
Answer

The implementation of inits in the Haskell Report, which is identical or nearly identical to the implementations used up to base 4.7.0.1 (GHC 7.8.3), is horribly slow. In particular, the fmap applications stack up recursively, so forcing successive elements of the result gets slower and slower.
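For reference, a sketch of that Report-style definition (the Report writes it with map; on lists fmap and map coincide, matching the expansion below):

    inits :: [a] -> [[a]]
    inits []     = [[]]
    inits (x:xs) = [] : fmap (x:) (inits xs)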
    inits [1,2,3,4] = [] : fmap (1:) (inits [2,3,4])
      = [] : fmap (1:) ([] : fmap (2:) (inits [3,4]))
      = [] : [1] : fmap (1:) (fmap (2:) ([] : fmap (3:) (inits [4])))
      ....
The simplest asymptotically optimal implementation, explored by Bertram Felgenhauer, is based on applying take to successively larger arguments:

    {-# LANGUAGE BangPatterns #-}
    inits xs = [] : go (1 :: Int) xs
      where
        go !l (_:ls) = take l xs : go (l + 1) ls
        go _  []     = []
Felgenhauer was able to eke some extra performance out of this by using a private, non-fusing version of take, but it was still not as fast as it could be.
The following very simple implementation is significantly faster in most cases:

    inits = map reverse . scanl (flip (:)) []
In some weird corner cases (like map head . inits), that simple implementation is asymptotically non-optimal, because extracting even the first element of each prefix forces a reversal that is linear in the prefix's length. I therefore wrote a version using the same technique, but based on Chris Okasaki's banker's queues, that is both asymptotically optimal and nearly as fast. Joachim Breitner optimized it further, using a strict scanl' rather than the usual scanl, and this implementation got into GHC 7.8.4.
inits can now produce the spine of the result in O(n) time; forcing the entire result still requires O(n^2) time, because none of the conses can be shared among the different initial segments. If you want absurdly fast inits and tails, your best bet is to use Data.Sequence; Louis Wasserman's implementation is magical. Another possibility would be to use Data.Vector, which presumably uses slicing for such things.
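A hedged sketch of the Data.Sequence route (assuming the containers package is available; Seq.inits and Seq.tails return a Seq of Seqs that share structure internally):

    import qualified Data.Sequence as Seq
    import Data.Foldable (toList)

    -- All prefixes of a list, via Data.Sequence's structure-sharing inits.
    initsViaSeq :: [a] -> [[a]]
    initsViaSeq = map toList . toList . Seq.inits . Seq.fromList

    -- All suffixes of a list, via Data.Sequence's tails.
    tailsViaSeq :: [a] -> [[a]]
    tailsViaSeq = map toList . toList . Seq.tails . Seq.fromList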