| View previous topic :: View next topic | | Author | Message |
|---|
grackle Guest
| Posted: Thu Nov 16, 2006 2:18 pm Post subject: Lazy sequence for reading file |
| I'm writing yet another flat file parser, and considering how often I do this I would very much like to write: (defun parse-test-case (stream) (map 'vector #'parse-line (read-lines stream))) I can do this easily myself if I write write read-lines to simply accumulate the lines in a sequence. I would rather do it lazily, though, to avoid reading the entire file into memory. As far as I can tell, a list is a list is a list. It's a concrete thing, so I can't create my own list type. That means I need to use a sequence type. Is there an existing sequence type that does this? If not, I need to implement a sequence type. Section 17.1 of the Hyperspec listst sequence functions; are these generic functions that I need to specialize for my sequence type? Is there a smaller subset that is sufficient? Also, I often want to throw away the map results, so I want X where map : X :: mapcar : mapc. I will need to write this myself, correct? Thanks, David |
| | Back to top | |  | Andrew Philpot Guest
| Posted: Thu Nov 16, 2006 3:19 pm Post subject: Re: Lazy sequence for reading file |
| In article <1163701117.305692.82890@f16g2000cwb.googlegroups.com>, grackle wrote:
| Quote: | I'm writing yet another flat file parser, and considering how often I do this I would very much like to write: (defun parse-test-case (stream) (map 'vector #'parse-line (read-lines stream))) I can do this easily myself if I write write read-lines to simply accumulate the lines in a sequence. I would rather do it lazily, though, to avoid reading the entire file into memory. As far as I can tell, a list is a list is a list. It's a concrete thing, so I can't create my own list type. That means I need to use a sequence type.
|
You could implement a lazy list, and give it an API that mirrors that of built-on CONS-constructed lists, starting with MAP, etc. Since you don't use the lazy list as a data structure (assign, mutate, etc.), one might argue this would be of little practical benefit. In Lisp, "lazy" sometimes ends up as "implemented with a closure." Another way to do this would be do forgo the lazy list directly and define a function that embodies the closure: (map-over-lines-of-file #'parse-line stream) or a macro wrapping of something like the above (with-lines-from-file (stream file) (parse-line stream)) Of course you could implement the lazy list, then the mapper on top of it, then the macro on top of it! When I recapitulate this kind of idiom, my solutions look like the WITH-LINES-FROM-FILE version. Not surprisingly, these tend to take quite a few keyword arguments like WITH-OPEN-FILE (often but not always defaulted). Including all of these might tend to make your READ-LINES lazy list specification a lot less perspicuous. With either but particularly well with the WITH-LINES-FROM-FILE syntax, you can use either functional filters or control flow to limit what gets collected...
| Quote: | Is there an existing sequence type that does this? If not, I need to implement a sequence type. Section 17.1 of the Hyperspec listst sequence functions; are these generic functions that I need to specialize for my sequence type? Is there a smaller subset that is sufficient? Also, I often want to throw away the map results, so I want X where map : X :: mapcar : mapc. I will need to write this myself, correct?
|
For this, I think you want MAP with first argument output-type-spec NIL.
#A |
| | Back to top | |  | Guest
| Posted: Thu Nov 16, 2006 3:44 pm Post subject: Re: Lazy sequence for reading file |
| grackle wrote:
| Quote: | I'm writing yet another flat file parser, and considering how often I do this I would very much like to write: (defun parse-test-case (stream) (map 'vector #'parse-line (read-lines stream))) I can do this easily myself if I write write read-lines to simply accumulate the lines in a sequence. I would rather do it lazily, though, to avoid reading the entire file into memory. As far as I can tell, a list is a list is a list. It's a concrete thing, so I can't create my own list type. That means I need to use a sequence type. Is there an existing sequence type that does this? If not, I need to implement a sequence type. Section 17.1 of the Hyperspec listst sequence functions; are these generic functions that I need to specialize for my sequence type? Is there a smaller subset that is sufficient? Also, I often want to throw away the map results, so I want X where map : X :: mapcar : mapc. I will need to write this myself, correct?
|
I'm confused as to what you're trying to do... you want to be able to call PARSE-LINE on the result of READ-LINE for each line of the file and collect the results in a vector? Do you know the length of the vector beforehand? (defun parse-test-case (stream) (let ((done (gensym)) xx) (do ((line (read-line stream nil done) (read-line stream nil done)) ((eq line done) (coerce (nreverse xx) 'vector)) (push (parse-line line) xx)))) or is your problem not being able to use a generic MAPC/MACPAR in this situation? (defun do-lines ((var stream &optional return) &body body) (let ((done (gensym)) (s (gensym))) `(let ((,s ,stream)) (do ((var (read-line ,s nil ,done) (read-line ,s nil ,done))) ((eq ,var ,done) ,return) ,@body)))) (defun mapc-lines (fn stream) (do-lines (l stream) (funcall fn l))) (defun mapcar-lines (fn stream) (let (xx) (do-lines (l stream (nreverse xx)) (push (funcall fn l) xx)))) (defun parse-test-case (stream) (coerce (mapcar-lines #'parse-line stream) 'vector)) hth Nick
|
| | Back to top | |  | grackle Guest
| Posted: Thu Nov 16, 2006 4:39 pm Post subject: Re: Lazy sequence for reading file |
| nallen05@gmail.com wrote:
| Quote: | I'm confused as to what you're trying to do... you want to be able to call PARSE-LINE on the result of READ-LINE for each line of the file and collect the results in a vector? Do you know the length of the vector beforehand?
|
Sometimes I want to collect, sometimes not. I usually don't know the length of the input beforehand.
| Quote: | or is your problem not being able to use a generic MAPC/MACPAR in this situation?
|
Yes. There are many macros and functions that accept sequence types, and I don't want to reimplement every one I use for every sequence-like type I create! I would like to create a sequence object that allows me to use any macro or function that expects a sequence. Andrew Philpot wrote:
| Quote: | You could implement a lazy list, and give it an API that mirrors that of built-on CONS-constructed lists, starting with MAP, etc.
|
The list and sequence APIs are there and they work. I think all I need to know is how to create a subtype of list or sequence. Then only a single function make-lazy-readline-list or make-lazy-readline-sequence is needed to take advantage of map, mapc, mapcar, dolist, etc.
| Quote: | Also, I often want to throw away the map results, so I want X where map : X :: mapcar : mapc. I will need to write this myself, correct? For this, I think you want MAP with first argument output-type-spec NIL.
|
Thanks! -David |
| | Back to top | |  | Barry Margolin Guest
| Posted: Thu Nov 16, 2006 10:01 pm Post subject: Re: Lazy sequence for reading file |
| In article <1163709551.403279.58060@m73g2000cwd.googlegroups.com>, "grackle" <davidhuebel@gmail.com> wrote:
| Quote: | Yes. There are many macros and functions that accept sequence types, and I don't want to reimplement every one I use for every sequence-like type I create! I would like to create a sequence object that allows me to use any macro or function that expects a sequence.
|
Sequences are not object-oriented, so there's no standard way to extend the sequence functions. You also can't define new arithmetic types -- for instance, you can't make the + function operate on matrices. -- Barry Margolin, barmar@alum.mit.edu Arlington, MA *** PLEASE post questions in newsgroups, not directly to me *** *** PLEASE don't copy me on replies, I'll read them in the group *** |
| | Back to top | |  | Lars Rune Nøstdal Guest
| Posted: Fri Nov 17, 2006 5:05 am Post subject: Re: Lazy sequence for reading file |
| On Thu, 16 Nov 2006 21:01:54 -0500, Barry Margolin wrote:
| Quote: | In article <1163709551.403279.58060@m73g2000cwd.googlegroups.com>, "grackle" <davidhuebel@gmail.com> wrote: Yes. There are many macros and functions that accept sequence types, and I don't want to reimplement every one I use for every sequence-like type I create! I would like to create a sequence object that allows me to use any macro or function that expects a sequence. Sequences are not object-oriented, so there's no standard way to extend the sequence functions. You also can't define new arithmetic types -- for instance, you can't make the + function operate on matrices.
|
cl:+ isn't a method, but you could create a my-cl:+ that is: (defpackage :my-cl (:use :cl) (:shadow :+)) (in-package :my-cl) (defclass not-really-a-matrix () ((a :accessor a-of :initarg :a :initform 0) (b :accessor b-of :initarg :b :initform 0))) (defmethod + ((x number) &rest args) (apply #'cl:+ x args)) (defmethod + ((x not-really-a-matrix) &rest args) (apply #'+ (a-of x) (b-of x) args)) ...or something..then `:use my-cl' in your application-package. -- Lars Rune Nøstdal http://lars.nostdal.org/ |
| | Back to top | |  | Joel Ray Holveck Guest
| Posted: Fri Nov 17, 2006 6:30 am Post subject: Re: Lazy sequence for reading file |
| On Nov 16, 12:39 pm, "grackle" <davidhue...@gmail.com> wrote:
| Quote: | situation?Yes. There are many macros and functions that accept sequence types, and I don't want to reimplement every one I use for every sequence-like type I create! I would like to create a sequence object that allows me to use any macro or function that expects a sequence.
|
First: There are no facilities to write your own sequence types and have them work with the Common Lisp sequence functions. Those aren't generic functions. As others have mentioned, you could easily write your own package that re-exports the bulk of the CL symbols, and shadows the CL sequence functions with generic replacements. Presumably, the default primary method of the generic replacements would call the CL version. Second: All of the built-in sequence functions are written with the assumption that the entire contents of the sequence are available immediately. For example, if you fed MAP a lazy sequence, you would probably like to be able to get a lazy sequence back from it... but the MAP function was written to apply itself to everything immediately. Reusing the Common Lisp sequence functions on lazy sequences is futile, since they're written with different assumptions in mind. Dick Waters did some work regarding lazy value generation. Some implementations might support his proposals, or there might be libraries out there. joelh |
| | Back to top | |  | Christophe Rhodes Guest
| Posted: Fri Nov 17, 2006 7:42 am Post subject: Re: Lazy sequence for reading file |
| "Joel Ray Holveck" <joelh@piquan.org> writes:
| Quote: | First: There are no facilities to write your own sequence types and have them work with the Common Lisp sequence functions. Those aren't generic functions.
|
This is true at the moment, though nothing (or almost nothing: see CDR 3) in the standard prevents an implementation from allowing this.
| Quote: | Second: All of the built-in sequence functions are written with the assumption that the entire contents of the sequence are available immediately. For example, if you fed MAP a lazy sequence, you would probably like to be able to get a lazy sequence back from it... but the MAP function was written to apply itself to everything immediately. Reusing the Common Lisp sequence functions on lazy sequences is futile, since they're written with different assumptions in mind.
|
The MAP example is a good one, because it's not amenable to specialization in the ordinary CLOS manner. However, for many of the CL sequence functions it is in fact possible to get the lazy -> lazy behaviour that one might want from sequence functions, so one could imagine (say) REMOVE implemented lazily on a lazy sequence. Cheers, Christophe |
| | Back to top | |  | Wade Humeniuk Guest
| Posted: Fri Nov 17, 2006 10:32 am Post subject: Re: Lazy sequence for reading file |
| grackle wrote:
| Quote: | I'm writing yet another flat file parser, and considering how often I do this I would very much like to write: (defun parse-test-case (stream) (map 'vector #'parse-line (read-lines stream))) I can do this easily myself if I write write read-lines to simply accumulate the lines in a sequence. I would rather do it lazily, though, to avoid reading the entire file into memory. As far as I can tell, a list is a list is a list. It's a concrete thing, so I can't create my own list type. That means I need to use a sequence type. Is there an existing sequence type that does this? If not, I need to implement a sequence type. Section 17.1 of the Hyperspec listst sequence functions; are these generic functions that I need to specialize for my sequence type? Is there a smaller subset that is sufficient? Also, I often want to throw away the map results, so I want X where map : X :: mapcar : mapc. I will need to write this myself, correct?
|
It's been done before. Look at the SERIES package. http://series.sourceforge.net/ Wade |
| | Back to top | |  | grackle Guest
| Posted: Fri Nov 17, 2006 10:50 am Post subject: Re: Lazy sequence for reading file |
| Joel Ray Holveck wrote:
| Quote: | First: There are no facilities to write your own sequence types and have them work with the Common Lisp sequence functions.
|
Thanks; this is the answer I needed. I would be curious to hear why.
| Quote: | Second: All of the built-in sequence functions are written with the assumption that the entire contents of the sequence are available immediately. For example, if you fed MAP a lazy sequence, you would probably like to be able to get a lazy sequence back from it... but the MAP function was written to apply itself to everything immediately. Reusing the Common Lisp sequence functions on lazy sequences is futile, since they're written with different assumptions in mind.
|
It would work in the case where you want to immediately process as much of the sequence as is available, which would include reading from a file or a socket. Potentially infinite sequences could be used as long as one of the sequences being mapped could be depended on to block or terminate when appropriate: (mapc (lambda (line number) (format t "Line ~a: ~a~%" number line)) (make-readlines-sequence my-input-stream) (make-integer-sequence :start 1 :step 1)) I'm just looking for the most carpal-friendly way to parse flat files, so Nick's suggestion is the way to go, I think. -David |
| | Back to top | |  | Barry Margolin Guest
| Posted: Fri Nov 17, 2006 11:52 am Post subject: Re: Lazy sequence for reading file |
| In article <pan.2006.11.17.09.05.11.720229@gmail.com>, Lars Rune Nøstdal <larsnostdal@gmail.com> wrote:
| Quote: | On Thu, 16 Nov 2006 21:01:54 -0500, Barry Margolin wrote: In article <1163709551.403279.58060@m73g2000cwd.googlegroups.com>, "grackle" <davidhuebel@gmail.com> wrote: Yes. There are many macros and functions that accept sequence types, and I don't want to reimplement every one I use for every sequence-like type I create! I would like to create a sequence object that allows me to use any macro or function that expects a sequence. Sequences are not object-oriented, so there's no standard way to extend the sequence functions. You also can't define new arithmetic types -- for instance, you can't make the + function operate on matrices. cl:+ isn't a method, but you could create a my-cl:+ that is: .... ..or something..then `:use my-cl' in your application-package.
|
That won't make all the functions not written by him use it. Relating this back to the OP's problem, he wants built-in functions like MAPCAR to be able to iterate over his lazy sequences. To make use of your suggestion he would have to create MY-CL:MAPCAR, MY-CL:LOOP, MY-CL:DO, etc. Compare this will languages like CLU and C++, which have a standard generic protocol for iterators, so that programmers can define their own sequences and make use of the built-in for-loop. -- Barry Margolin, barmar@alum.mit.edu Arlington, MA *** PLEASE post questions in newsgroups, not directly to me *** *** PLEASE don't copy me on replies, I'll read them in the group *** |
| | Back to top | |  | grackle Guest
| Posted: Fri Nov 17, 2006 2:58 pm Post subject: Re: Lazy sequence for reading file |
| Barry Margolin wrote:
| Quote: | Compare this will languages like CLU and C++, which have a standard generic protocol for iterators, so that programmers can define their own sequences and make use of the built-in for-loop.
|
This is what I'm accustomed to and was expecting. It's very un-Lispy to make a fundamental part of the language off-limits to users. -David |
| | Back to top | |  | Joel Ray Holveck Guest
| Posted: Fri Nov 17, 2006 6:49 pm Post subject: Re: Lazy sequence for reading file |
| Christophe Rhodes wrote:
| Quote: | The MAP example is a good one, because it's not amenable to specialization in the ordinary CLOS manner. However, for many of the CL sequence functions it is in fact possible to get the lazy -> lazy behaviour that one might want from sequence functions, so one could imagine (say) REMOVE implemented lazily on a lazy sequence.
|
Oh, I think you misunderstand me. I'm just saying that the only thing that he could reuse would be the names... the implementations wouldn't be reusable. Sure you could implement REMOVE against a lazy sequence, but it would have to be an all-new implementation. joelh |
| | Back to top | |  | |
|