haskell - Parsec separator / terminator


Apparently I'm too dumb to figure this out...

Consider the following string:

foobar(123, 456, 789) 

I'm trying to work out how to parse this. In particular,

call = do
  cs <- many1 letter
  char '('
  as <- many argument
  return (cs, as)

argument = manyTill anyChar (char ',' <|> char ')')

This works perfectly until I add stuff to the end of the input string, at which point it tries to parse that stuff as the next argument and gets upset when it doesn't end with a comma or bracket.

Fundamentally, the trouble is that a comma is a separator, while a bracket is a terminator. Parsec doesn't appear to provide a combinator for that.

Just to make things more interesting, the input string can also be

foobar(123, 456, ... 

which indicates that the message is incomplete. There appears to be no way of parsing a sequence with two possible terminators and knowing which one was found. (I want to know whether the argument list was complete or incomplete.)

Can anyone figure out how I can climb out of this?

You should exclude the separator/terminator characters from the characters allowed in a function argument. Also, you can use between and sepBy to make the difference between separators and terminators clearer:

call = do
  cs <- many1 letter
  as <- between (char '(') (char ')')
      $ sepBy (many1 (noneOf ",)")) (char ',')
  return (cs, as)

However, this is probably still not what you want, because it doesn't handle whitespace properly. You should look at Text.Parsec.Token for a more robust way to do this.
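As a rough illustration of the Text.Parsec.Token suggestion, here is a minimal sketch that builds a token parser from emptyDef and uses its lexeme, parens and commaSep fields so that whitespace around names, brackets and commas is skipped. The names lexer and argument are my own, and treating an argument as "anything up to a comma, bracket or whitespace" is an assumption, not something the question specifies.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)
import Text.Parsec.Language (emptyDef)
import qualified Text.Parsec.Token as Tok

-- Token parser built from the empty language definition; the parsers
-- taken from it skip any whitespace that follows the token.
lexer :: Tok.TokenParser ()
lexer = Tok.makeTokenParser emptyDef

-- Assumption: an argument is a non-empty run of characters that are
-- neither separators, terminators, nor whitespace.
argument :: Parser String
argument = Tok.lexeme lexer (many1 (noneOf ",) \t\n"))

-- Function name, then a bracketed, comma-separated argument list.
call :: Parser (String, [String])
call = do
  cs <- Tok.lexeme lexer (many1 letter)
  as <- Tok.parens lexer (Tok.commaSep lexer argument)
  return (cs, as)
```

With this sketch, parse call "" "foobar( 123 , 456,789 )" should yield the three arguments without any surrounding spaces. It only covers the complete, bracket-terminated form of the input.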

Edit

With the ...-addition, it indeed becomes a bit weird, and I don't think it fits nicely into any of the predefined combinators, so we'll have to do it ourselves.

Let's define a type for our results:

data Args = String :. Args | Nil | Dots
  deriving Show

infixr 5 :.

That's like a list, but it has two different kinds of "empty list" to distinguish the ... case. Of course, you can also use ([String], Bool) as the result type, but I'll leave that as an exercise. The following assumes we have

import Control.Applicative ((<$>), (<*>), (<$), (*>))

The parsers become:

call = do
  cs <- many1 letter
  char '('
  as <- args
  return (cs, as)

args =
      (:.) <$> arg <*> argCont
  <|> Dots <$ string "..."

arg = many1 (noneOf ".,)")

argCont =
      Nil <$ char ')'
  <|> char ',' *> args

This handles everything fine except whitespace, for which my original recommendation to look at token parsers remains.

Let's test:

GHCi> parseTest call "foobar(foo,bar,baz)"
("foobar","foo" :. ("bar" :. ("baz" :. Nil)))
GHCi> parseTest call "foobar(1,2,..."
("foobar","1" :. ("2" :. Dots))
GHCi> parseTest ((,) <$> call <*> call) "foo(1)bar(2,...)"
(("foo","1" :. Nil),("bar","2" :. Dots))
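For the ([String], Bool) result type mentioned above, one possible sketch follows. The Bool is True when the list was closed by a bracket and False when it was cut off by "..."; that encoding, and the helper names dots and more, are my own choices rather than part of the answer. Like the parsers above, it does not handle whitespace.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- One argument: any non-empty run without '.', ',' or ')'.
arg :: Parser String
arg = many1 (noneOf ".,)")

-- Arguments after the opening bracket, plus a completeness flag:
-- True if ')' ended the list, False if "..." did.
args :: Parser ([String], Bool)
args = dots <|> more
  where
    dots = ([], False) <$ string "..."
    more = do
      a <- arg
      (rest, complete) <- argCont
      return (a : rest, complete)

-- What may follow an argument: a terminator or a separator.
argCont :: Parser ([String], Bool)
argCont =
      ([], True) <$ char ')'
  <|> (char ',' *> args)

call :: Parser (String, ([String], Bool))
call = do
  cs <- many1 letter
  _ <- char '('
  r <- args
  return (cs, r)
```

Here parse call "" "foobar(1,2,..." should give the arguments 1 and 2 with the flag False, while a bracket-terminated list gives True.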
