This is a Gabriel Gonzalez answer about combining of several mapM/filters with Pipes, Streaming, etc libraries:
---cut---
I think it's important to distinguish between two separate concepts: "fusion" vs "one-pass". "Fusion" refers to avoiding the allocation of intermediate data structures when you transform a stream multiple times whereas "one-pass" means that you don't traverse the sequence of elements twice when you transform the stream multiple times (i.e. you go over the stream in one pass). You can have a "one-pass" implementation without "fusion" but you cannot have "fusion" without a "one-pass" implementation.
"Fusion" is purely an optimization, meaning that whether or not an implementation uses "fusion" only affects your program's performance but won't affect its behavior. However, "one-pass" is not just an optimization: one-pass versus multiple pass changes the behavior of your program, especially once your stream has effects like in these streaming libraries.
Out of the two properties, "one-pass" is *much* more important. The reason why is that "one-pass" ensures that certain functor laws hold. To see why, let's consider a case where they *don't* hold, which is `Data.List.mapM`. Normally, you mtigh expect the following functor laws to hold:
Data.List.mapM (f <=< g) = Data.List.mapM f . Data.List.mapM g
Data.List.mapM return = id
However, the above two laws don't actually hold for `Data.List.mapM`. For example, the first law does not hold because the order of effects are not the same for the left-hand and right-hand sides of the equation. The left-hand side of the equation interleaves the effects of `f` and `g` whereas the right-hand side runs all of `g`'s effects first followed by all of `f`'s effects. The second equation is also wrong because `Data.List.mapM` misbehaves on infinite lists:
Data.List.mapM return (repeat x) = _|_
id (repeat x) = repeat x
This is the root of why `Data.List.mapM` is "bad"
However, the streaming libraries have their own versions of `mapM` which do obey the above functor laws. For example, if you take the `list-transformer` library and define:
mapM :: (a -> m b) -> ListT m a -> ListT m b
mapM f as = do
a <- as="" br="">->
lift (f a)
... then this *does* obey the following functor laws:
mapM (f <=< g) = mapM f . mapM g
mapM return = id
For the first equation, both sides of the equation interleave the effects of `f` and `g`. For the second equation, both sides of the equation behave correctly on infinite `ListT` streams. These functor laws hold because `ListT` is has the "one-pass" property.
So to answer your question: it's not exactly the Haskell `Functor` type class per se that is important here, but the functor laws are important (for a more general notion of functor) in establishing why a single pass implementation matters.
---cut---
Thanks, Gabriel!
Комментариев нет:
Отправить комментарий
Thanks for your posting!