| Copyright | (c) The University of Glasgow 2001-2009 (c) Giovanni Campagna <gcampagn@cs.stanford.edu> 2014 | 
|---|---|
| License | BSD-style (see the file LICENSE) | 
| Maintainer | libraries@haskell.org | 
| Stability | unstable | 
| Portability | non-portable (GHC Extensions) | 
| Safe Haskell | Safe-Inferred | 
| Language | Haskell2010 | 
GHC.Compact
Description
This module provides a data structure, called a Compact, for
 holding immutable, fully evaluated data in a consecutive block of memory.
 Compact regions are good for two things:
- Data in a compact region is not traversed during GC; any incoming pointer to a compact region keeps the entire region live. Thus, if you put a long-lived data structure in a compact region, you may save a lot of cycles during major collections, since you will no longer be (uselessly) retraversing this data structure.
- Because the data is stored contiguously, you can easily dump the memory to disk and/or send it over the network. For applications that are not bandwidth bound (GHC's heap representation can be as much of a x4 expansion over a binary serialization), this can lead to substantial speedups.
For example, suppose you have a function loadBigStruct :: IO BigStruct,
 which loads a large data structure from the file system.  You can "compact"
 the structure with the following code:
do r <-compact=<< loadBigStruct let x =getCompactr :: BigStruct -- Do things with x
Note that compact will not preserve internal sharing; use
 compactWithSharing (which is 10x slower) if you have cycles and/or
 must preserve sharing.  The Compact pointer r can be used
 to add more data to a compact region; see compactAdd or
 compactAddWithSharing.
The implementation of compact regions is described by:
- Edward Z. Yang, Giovanni Campagna, Ömer Ağacan, Ahmed El-Hassany, Abhishek Kulkarni, Ryan Newton. "/Efficient communication and Collection with Compact Normal Forms/". In Proceedings of the 20th ACM SIGPLAN International Conference on Functional Programming. September 2015. http://ezyang.com/compact.html
This library is supported by GHC 8.2 and later.
Synopsis
- data Compact a = Compact Compact# a (MVar ())
- compact :: a -> IO (Compact a)
- compactWithSharing :: a -> IO (Compact a)
- compactAdd :: Compact b -> a -> IO (Compact a)
- compactAddWithSharing :: Compact b -> a -> IO (Compact a)
- getCompact :: Compact a -> a
- inCompact :: Compact b -> a -> IO Bool
- isCompact :: a -> IO Bool
- compactSize :: Compact a -> IO Word
- compactResize :: Compact a -> Word -> IO ()
- mkCompact :: Compact# -> a -> State# RealWorld -> (# State# RealWorld, Compact a #)
- compactSized :: Int -> Bool -> a -> IO (Compact a)
The Compact type
A Compact contains fully evaluated, pure, immutable data.
Compact serves two purposes:
- Data stored in a Compacthas no garbage collection overhead. The garbage collector considers the wholeCompactto be alive if there is a reference to any object within it.
- A Compactcan be serialized, stored, and deserialized again. The serialized data can only be deserialized by the exact binary that created it, but it can be stored indefinitely before deserialization.
Compacts are self-contained, so compacting data involves copying
 it; if you have data that lives in two Compacts, each will have a
 separate copy of the data.
The cost of compaction is fully evaluating the data + copying it. However,
 because compact does not stop-the-world, retaining internal sharing during
 the compaction process is very costly. The user can choose whether to
 compact or compactWithSharing.
When you have a Compact agetCompact.  The Compact type
 serves as handle on the region itself; you can use this handle
 to add data to a specific Compact with compactAdd or
 compactAddWithSharing (giving you a new handle which corresponds
 to the same compact region, but points to the newly added object
 in the region).  At the moment, due to technical reasons,
 it's not possible to get the Compact aa,
 so make sure you hold on to the handle as necessary.
Data in a compact doesn't ever move, so compacting data is also a way to pin arbitrary data structures in memory.
There are some limitations on what can be compacted:
- Functions. Compaction only applies to data.
- Pinned ByteArray#objects cannot be compacted. This is for a good reason: the memory is pinned so that it can be referenced by address (the address might be stored in a C data structure, for example), so we can't make a copy of it to store in theCompact.
- Objects with mutable pointer fields (e.g. IORef,MutableArray) also cannot be compacted, because subsequent mutation would destroy the property that a compact is self-contained.
If compaction encounters any of the above, a CompactionFailed
 exception will be thrown by the compaction operation.
Compacting data
compact :: a -> IO (Compact a) #
Compact a value. O(size of unshared data)
If the structure contains any internal sharing, the shared data
 will be duplicated during the compaction process.  This will
 not terminate if the structure contains cycles (use compactWithSharing
 instead).
The object in question must not contain any functions or data with mutable
 pointers; if it does, compact will raise an exception. In the future, we
 may add a type class which will help statically check if this is the case or
 not.
compactWithSharing :: a -> IO (Compact a) #
Compact a value, retaining any internal sharing and cycles. O(size of data)
This is typically about 10x slower than compact, because it works
 by maintaining a hash table mapping uncompacted objects to
 compacted objects.
The object in question must not contain any functions or data with mutable
 pointers; if it does, compact will raise an exception. In the future, we
 may add a type class which will help statically check if this is the case or
 not.
compactAdd :: Compact b -> a -> IO (Compact a) #
Add a value to an existing Compact.  This will help you avoid
 copying when the value contains pointers into the compact region,
 but remember that after compaction this value will only be deallocated
 with the entire compact region.
Behaves exactly like compact with respect to sharing and what data
 it accepts.
compactAddWithSharing :: Compact b -> a -> IO (Compact a) #
Add a value to an existing Compact, like compactAdd,
 but behaving exactly like compactWithSharing with respect to sharing and
 what data it accepts.
Inspecting a Compact
getCompact :: Compact a -> a #
Retrieve a direct pointer to the value pointed at by a Compact reference.
 If you have used compactAdd, there may be multiple Compact references
 into the same compact region. Upholds the property:
inCompact c (getCompact c) == True
Check if the argument is in any Compact.  If true, the value in question
 is also fully evaluated, since any value in a compact region must
 be fully evaluated.
compactSize :: Compact a -> IO Word #
Returns the size in bytes of the compact region.
Other utilities
compactResize :: Compact a -> Word -> IO () #
Experimental This function doesn't actually resize a compact region; rather, it changes the default block size which we allocate when the current block runs out of space, and also appends a block to the compact region.
Internal operations
mkCompact :: Compact# -> a -> State# RealWorld -> (# State# RealWorld, Compact a #) #
Make a new Compact object, given a pointer to the true
 underlying region.  You must uphold the invariant that a lives
 in the compact region.
Arguments
| :: Int | Size of the compact region, in bytes | 
| -> Bool | Whether to retain internal sharing | 
| -> a | |
| -> IO (Compact a) | 
Transfer a into a new compact region, with a preallocated size (in
 bytes), possibly preserving sharing or not.  If you know how big the data
 structure in question is, you can save time by picking an appropriate block
 size for the compact region.