Reputation: 2402
I have defined a custom type as follows:
-- Atom reference number, x coordinate, y coordinate, z coordinate, element symbol,
-- atom name, residue sequence number, amino acid abbreviation
type Atom = (Int, Double, Double, Double, Word8, ByteString, Int, ByteString)
I would like to gather all of the atoms with a certain residue sequence number nm.
This would be nice:
[x | x <- p, d == nm]
where
(_, _, _, _, _, _, d, _) = x
where p is a list of atoms.
However, this does not work because I cannot access the variable x outside of the list comprehension, nor can I think of a way to access a specific tuple value from inside the list comprehension.
Is there a tuple method I am missing, or should I be using a different data structure?
I know I could write a recursive function that unpacks and checks every tuple in the list p, but I am actually trying to use this nested inside an already recursive function, so I would rather not need to introduce that complexity.
Upvotes: 1
Views: 1502
Reputation: 40797
This works:
[x | x@(_, _, _, _, _, _, d, _) <- p, d == nm]
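For the comprehension to return whole tuples, x has to be bound inside the generator; an as-pattern (x@(...)) binds the tuple and its seventh component at the same time. A minimal, self-contained sketch of that idea, using the tuple type from the question. The function name atomsWithResidue and the sample data are assumptions made up for illustration:

```haskell
import Data.ByteString (ByteString)
import qualified Data.ByteString.Char8 as BC
import Data.Word (Word8)

-- Tuple type exactly as given in the question.
type Atom = (Int, Double, Double, Double, Word8, ByteString, Int, ByteString)

-- Keep every atom whose residue sequence number (7th component) equals nm.
-- The as-pattern binds the whole tuple to x and the 7th component to d.
atomsWithResidue :: Int -> [Atom] -> [Atom]
atomsWithResidue nm p = [x | x@(_, _, _, _, _, _, d, _) <- p, d == nm]

-- Hypothetical sample data, for illustration only.
sampleAtoms :: [Atom]
sampleAtoms =
  [ (1, 0.0, 0.0, 0.0, 78, BC.pack "N",  1, BC.pack "MET")
  , (2, 1.5, 0.0, 0.0, 67, BC.pack "CA", 1, BC.pack "MET")
  , (3, 3.0, 1.0, 0.0, 78, BC.pack "N",  2, BC.pack "ALA")
  ]
```

Since this is an ordinary expression, it can be used inside an existing recursive function without adding any recursion of its own.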
However, you should really define your own data type here. A three-element tuple is suspicious; an eight-element tuple is very bad news indeed. Tuples are difficult to work with and less type-safe than data types (if you represent two different kinds of data with two tuples with the same element types, they can be used interchangeably). Here's how I'd write Atom as a record:
data Point3D = Point3D Double Double Double
  deriving (Eq, Show)  -- needed so that Atom can derive Eq and Show
data Atom = Atom
  { atomRef :: Int
  , atomPos :: Point3D
  , atomSymbol :: Word8
  , atomName :: ByteString
  , atomSeqNum :: Int
  , atomAcidAbbrev :: ByteString
  } deriving (Eq, Show)
(The "atom" prefix is to avoid clashing with the names of fields in other records.)
You can then write the list comprehension as follows:
[x | x <- p, atomSeqNum x == nm]
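Put together as a compilable module, the record version might look like the sketch below. The field names come from the record above; the function name atomsWithSeqNum, the helper mkAtom, and the sample values are assumptions added for illustration:

```haskell
import Data.ByteString (ByteString)
import qualified Data.ByteString.Char8 as BC
import Data.Word (Word8)

data Point3D = Point3D Double Double Double
  deriving (Eq, Show)

data Atom = Atom
  { atomRef        :: Int
  , atomPos        :: Point3D
  , atomSymbol     :: Word8
  , atomName       :: ByteString
  , atomSeqNum     :: Int
  , atomAcidAbbrev :: ByteString
  } deriving (Eq, Show)

-- Same filter as the list comprehension, written with the field accessor.
atomsWithSeqNum :: Int -> [Atom] -> [Atom]
atomsWithSeqNum nm p = [x | x <- p, atomSeqNum x == nm]

-- Hypothetical helper that builds a placeholder atom for the examples.
mkAtom :: Int -> Int -> Atom
mkAtom ref seqNum = Atom
  { atomRef        = ref
  , atomPos        = Point3D 0 0 0
  , atomSymbol     = 67
  , atomName       = BC.pack "CA"
  , atomSeqNum     = seqNum
  , atomAcidAbbrev = BC.pack "GLY"
  }

sampleAtoms :: [Atom]
sampleAtoms = [mkAtom 1 1, mkAtom 2 1, mkAtom 3 2]
```

The same filter can also be written as filter ((== nm) . atomSeqNum) p if you prefer that to a comprehension.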
As a bonus, your definition of Atom becomes self-documenting, and you reap the benefits of increased type safety. Here's how you'd create an Atom using this definition:
myAtom = Atom
  { atomRef = ...
  , atomPos = ...
  , ... etc. ...
  }
By the way, it's probably a good idea to make some of the fields of these types strict, which can be done by putting an exclamation mark before the type of the field; this helps avoid space leaks from unevaluated thunks building up. For instance, since it doesn't make much sense to evaluate a Point3D without also evaluating all its components, I would instead define Point3D as:
data Point3D = Point3D !Double !Double !Double deriving (Eq, Show)
It would probably be a good idea to make most of the fields of Atom strict too, although perhaps not all of them; for example, the ByteString fields should be left non-strict if they're generated by the program, not always accessed, and possibly large. On the other hand, if their values are read from a file, then they should probably be made strict.
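As a rough sketch of that advice, here is one way the strictness annotations could be laid out. Which fields to leave lazy is a judgment call; keeping the two ByteString fields lazy here is only an assumption for illustration, as is the sample value lazyNameAtom:

```haskell
import Data.ByteString (ByteString)
import Data.Word (Word8)

data Point3D = Point3D !Double !Double !Double
  deriving (Eq, Show)

data Atom = Atom
  { atomRef        :: !Int
  , atomPos        :: !Point3D
  , atomSymbol     :: !Word8
  , atomName       :: ByteString   -- left lazy: evaluated only if accessed
  , atomSeqNum     :: !Int
  , atomAcidAbbrev :: ByteString   -- left lazy: evaluated only if accessed
  } deriving (Eq, Show)

-- Hypothetical value whose lazy fields are never evaluated.
-- The strict fields are forced as soon as the Atom itself is evaluated.
lazyNameAtom :: Atom
lazyNameAtom = Atom 1 (Point3D 0 0 0) 67 undefined 5 undefined
```

Reading atomSeqNum lazyNameAtom succeeds because the undefined values sit in lazy fields; placing undefined in any strict field would fail as soon as the Atom is evaluated.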
Upvotes: 9
Reputation: 68172
You should definitely use a different structure. Instead of using a tuple, take a look at records.
data Atom = Atom { reference :: Int
, position :: (Double, Double, Double)
, symbol :: Word8
, name :: ByteString
, residue :: Int
, abbreviation :: ByteString
}
You can then do something like this:
a = Atom ...
a {residue=10} -- a new Atom: a copy of a with residue set to 10 (a itself is unchanged)
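Record update syntax builds a new value and leaves the original untouched. A small sketch of that behaviour, using the record above with a deriving clause added so values can be compared; the names exampleAtom and movedAtom and their field values are assumptions for illustration:

```haskell
import Data.ByteString (ByteString)
import qualified Data.ByteString.Char8 as BC
import Data.Word (Word8)

data Atom = Atom
  { reference    :: Int
  , position     :: (Double, Double, Double)
  , symbol       :: Word8
  , name         :: ByteString
  , residue      :: Int
  , abbreviation :: ByteString
  } deriving (Eq, Show)

-- Hypothetical atom, for illustration only.
exampleAtom :: Atom
exampleAtom = Atom
  { reference    = 1
  , position     = (0, 0, 0)
  , symbol       = 67
  , name         = BC.pack "CA"
  , residue      = 3
  , abbreviation = BC.pack "GLY"
  }

-- A copy of exampleAtom with only the residue field changed.
movedAtom :: Atom
movedAtom = exampleAtom { residue = 10 }

-- The filter from the question, written with the field accessor.
atomsWithResidue :: Int -> [Atom] -> [Atom]
atomsWithResidue nm p = [x | x <- p, residue x == nm]
```

Here residue exampleAtom is still 3 after the update, and the field accessor also gives the filter the question asked for.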
Upvotes: 4