Talk:Foreign function interface
Code example, with copyright question mark
I wrote the following code example, based on an O'Reilly book. The only part from the book is the code itself; the prose is all mine. Unfortunately the book is "all rights reserved" so I'm not sure that using it would be legal. I don't suppose a claim of fair use would be acceptable? If not (or it wouldn't be considered fair use anyway) I'll see if I can get permission. Hairy Dude (talk) 07:25, 25 April 2009 (UTC)
Haskell's FFI provides a raw "foreign import" mechanism, which, given a symbol in a particular header file and a name and appropriate type signature, will bring a Haskell function of that name and type into scope and bind the C function to it. By "appropriate type" we mean that the types of the arguments of the C function and its return type each have an associated Haskell type, which is often a black box. Part of the function of the FFI is to provide a way of marshalling Haskell data into the correct format for use by the C function, and marshalling the C data that it returns into a form that the Haskell code can use.
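As a smaller illustration of the same mechanism, here is a minimal sketch that is not taken from the book. It binds the standard C function strlen and wraps it so that callers only see Haskell types. The names c_strlen and strLen are made up for this example, and it assumes GHC with the usual base library.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main where

import Foreign.C.String (CString, withCString)
import Foreign.C.Types (CSize (..))

-- Bind the C function strlen (declared in string.h) to a Haskell name.
-- c_strlen is a hypothetical name chosen for this sketch.
foreign import ccall unsafe "string.h strlen"
    c_strlen :: CString -> IO CSize

-- Marshal a Haskell String into a temporary C string, call strlen on it,
-- and convert the C result back into an Int.
strLen :: String -> IO Int
strLen s = withCString s $ \cs -> fromIntegral <$> c_strlen cs

main :: IO ()
main = strLen "hello" >>= print
```

Here withCString does the marshalling in one direction and fromIntegral in the other; the temporary C string is released when the callback returns.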
A further complication is that Haskell is a "pure" language: ordinary code does not permit side effects, whereas C is "impure": it allows unrestricted side effects. Haskell code that does have side effects is separated by having its type marked with "IO", which cannot usually be stripped off (to allow it would enable ostensibly "pure" code to have side effects). As a matter of fact, while it is rare for a C function to have no side effects at all, in many cases the only side effects are allocation of memory for the return value and modification of buffers which are passed in as arguments (so-called reentrant functions). These types of side effect can easily be encapsulated.
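For the rare C function that really has no side effects, the import can simply be given a pure type. The sketch below is my own, not from the book, and assumes GHC: it imports sin from the C maths library without any "IO" in its type. The names c_sin and fastSin are invented for the example, and the compiler takes the programmer's word that the function is pure.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main where

import Foreign.C.Types (CDouble (..))

-- No IO in the result type: we are asserting that the C function is pure.
-- c_sin is a hypothetical name chosen for this sketch.
foreign import ccall unsafe "math.h sin"
    c_sin :: CDouble -> CDouble

-- Convert between Haskell's Double and C's double on the way in and out.
fastSin :: Double -> Double
fastSin = realToFrac . c_sin . realToFrac

main :: IO ()
main = print (fastSin 0)
```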
Finally, there is the issue of memory management: C usually explicitly allocates and deallocates memory using the pair malloc() and free(), whereas Haskell uses implicit allocation and garbage collection. If we are to use a data value returned by a C function in Haskell, we must arrange for it to be managed by the garbage collector and ensure that it gets disposed of in the proper way, by calling free(), or we will get a memory leak.
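A minimal sketch of that arrangement, independent of the book's code and assuming GHC's base library: a block obtained from C's malloc is wrapped in a ForeignPtr whose finalizer is free(). The name newCounter is invented for the example.

```haskell
module Main where

import Foreign.C.Types (CInt)
import Foreign.ForeignPtr (ForeignPtr, newForeignPtr, withForeignPtr)
import Foreign.Marshal.Alloc (finalizerFree, malloc)
import Foreign.Storable (peek, poke)

-- Allocate one C int with malloc and hand it to the garbage collector.
-- finalizerFree arranges for free() to be called once the ForeignPtr
-- is no longer reachable. newCounter is a hypothetical helper.
newCounter :: CInt -> IO (ForeignPtr CInt)
newCounter start = do
    p <- malloc
    poke p start
    newForeignPtr finalizerFree p

main :: IO ()
main = do
    fp <- newCounter 41
    -- withForeignPtr exposes the raw pointer and keeps it alive meanwhile.
    withForeignPtr fp $ \p -> do
        n <- peek p
        poke p (n + 1)
    withForeignPtr fp peek >>= print
```

Nothing in main calls free() explicitly; that is left to the finalizer.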
In the following example (from the book Real World Haskell[1]), the function pcre_compile, written in C as part of the PCRE library, is to be imported into Haskell.
    -- Import the function.
    -- Note the C-native types such as "CString" (char *) and "Ptr CInt" (int *).
    foreign import ccall unsafe "pcre.h pcre_compile"
        c_pcre_compile :: CString
                       -> PCREOption
                       -> Ptr CString
                       -> Ptr CInt
                       -> Ptr Word8
                       -> IO (Ptr PCRE)
    ...
    -- The wrapper. Note the Haskell-native types ByteString and String.
    compile :: ByteString -> [PCREOption] -> Either String Regex
    compile str flags = unsafePerformIO $
      useAsCString str $ \pattern -> do
        alloca $ \errptr -> do
        alloca $ \erroffset -> do
            pcre_ptr <- c_pcre_compile pattern (combineOptions flags) errptr erroffset nullPtr
            if pcre_ptr == nullPtr
                then do
                    err <- peekCString =<< peek errptr
                    return (Left err)
                else do
                    reg <- newForeignPtr finalizerFree pcre_ptr
                    return (Right (Regex reg str))
The foreign import declaration brings the C function in question into scope as c_pcre_compile. The rest of the code is a wrapper, compile, which provides a native Haskell interface for the function: it takes a ByteString instead of a CString (i.e. a null-terminated character array), and it reports errors through its return type, rather than by returning an invalid value (like a null pointer) and causing the side effect of writing a string to a buffer, as the C function does.
Marshalling is performed in the following way:
- useAsCString converts the input ByteString into a C string.
- peek dereferences the pointer to get at the C string it points to (the error message), which peekCString turns into a String.
- newForeignPtr creates a new managed pointer, essentially registering the pointer with the garbage collector so it reclaims the memory used by the data when it goes out of scope. This includes specifying the finalizer, which is a function to call on the pointer when it gets garbage-collected, to make sure it is disposed of properly. In this case, finalizerFree will call the C function free().
- The data constructor Regex encapsulates the managed pointer to the opaque type PCRE, representing the C type of the compiled regular expression, along with the string that it was compiled from, as a Haskell type.
The function also requires two temporary buffers, which are allocated using alloca. In C one would have to say how much memory to allocate, but alloca takes the size from the Storable instance of the pointer's element type, which type inference determines automatically here. Finally, after all the side effects have been encapsulated, the otherwise dangerous function unsafePerformIO strips the "IO" marker and allows the function to be used in a pure context.
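The same pattern of alloca plus unsafePerformIO can be tried without PCRE installed. The following sketch is mine, not the book's, and assumes GHC: it wraps the standard C function frexp, which returns a mantissa and writes an exponent through an int pointer. The names c_frexp and splitFloat are invented for the example.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main where

import Foreign.C.Types (CDouble (..), CInt (..))
import Foreign.Marshal.Alloc (alloca)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peek)
import System.IO.Unsafe (unsafePerformIO)

-- double frexp(double x, int *exp): the exponent comes back through a buffer.
-- c_frexp is a hypothetical name chosen for this sketch.
foreign import ccall unsafe "math.h frexp"
    c_frexp :: CDouble -> Ptr CInt -> IO CDouble

-- alloca needs no size argument: expPtr is inferred to be a Ptr CInt from
-- its use below, and the size comes from the Storable instance of CInt.
-- The only side effect is the write to that temporary buffer, so the
-- wrapper is presented as a pure function.
splitFloat :: Double -> (Double, Int)
splitFloat x = unsafePerformIO $
    alloca $ \expPtr -> do
        m <- c_frexp (realToFrac x) expPtr
        e <- peek expPtr
        return (realToFrac m, fromIntegral e)

main :: IO ()
main = print (splitFloat 8)
```

Since 8 = 0.5 × 2^4, the program prints (0.5,4).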
- ^ O'Sullivan, Bryan; Goerzen, John; Stewart, Donald (2008). Real World Haskell. O'Reilly. ISBN 0596514980. Retrieved 25 April 2009.
Abbreviation markings
The template looks horrible in the source code and clutters things. This kind of thing must be automated to be useful and maintainable. Suggestion: remove it from the article. --Sigmundur (talk) 13:51, 29 September 2010 (UTC)
An FFI or a FFI
The article is using both inconsistently. — Preceding unsigned comment added by SimEdw (talk • contribs) 14:22, 8 March 2012 (UTC)