From alec at sensi.org Fri May 1 07:49:35 2020
From: alec at sensi.org (Alexander Voropay)
Date: Fri, 1 May 2020 00:49:35 +0300
Subject: [TUHS] as(1) on Ultrix-11 vs 2.11BSD
In-Reply-To:
References: <9A1BF33E-49C9-4712-BF25-4C0BBC504CD1@planet.nl>
Message-ID:

Can anyone please explain the last $0 pushed to the stack?
Early SysIII and SYSV on the i386 (and maybe on the i286) used
a similar syscall convention.

I wrote about this:
https://minnie.tuhs.org/pipermail/tuhs/2019-October/019274.html
https://minnie.tuhs.org/pipermail/tuhs/2019-October/019294.html

Example:
===
        .file   "test.s"
        .version "02.01"
        .set    WRITE,4
        .set    EXIT,1
        .text
        .align  4
        .globl  entry
entry:
        pushl   %ebp
        movl    %esp,%ebp
        subl    $8,%esp

        pushl   $14             / length
        pushl   $hello
        pushl   $1              / STDOUT
        pushl   $0
        movl    $WRITE,%eax
        lcall   $0x07,$0
        addl    $16,%esp

        pushl   $0
        movl    $EXIT,%eax
        lcall   $0x07,$0

        .data
        .align  4
hello:
        .byte   0x48,0x65,0x6c,0x6c,0x6f,0x2c,0x20,0x77,0x6f,0x72
        .byte   0x6c,0x64,0x21,0x0a,0x00

Wed, 29 Apr 2020 at 17:19, :
>
> Thanks for the link. With that help, I fixed the bug in the program:
>
>         mov     $6., -(sp)
>         mov     $1f, -(sp)
>         mov     $1,-(sp)
>         mov     $0,-(sp)
>         sys     4
>         add     $8., sp
>         mov     $0,-(sp)
>         mov     $0,-(sp)
>         sys     1
> 1:
>
> >> Sorry, I typed that in haste without testing. I don’t have a 2.11 system
> >> to try it on. However, reading the source code, I did that wrong. The
> >> args go on the stack, not in line with the code.
> >>         mov     $6, -(sp)
> >>         mov     a, -(sp)
> >>         mov     $1,-(sp)
> >>         sys     4
> >
> > Without suggesting that every helpful post should be tested, I find the
> > superb https://unix50.org web emulator excellent for such things.
> >
> > Many thanks to the folks hosting & maintaining this great resource!

From clemc at ccc.com Fri May 1 08:06:16 2020
From: clemc at ccc.com (Clem Cole)
Date: Thu, 30 Apr 2020 18:06:16 -0400
Subject: [TUHS] as(1) on Ultrix-11 vs 2.11BSD
In-Reply-To:
References: <9A1BF33E-49C9-4712-BF25-4C0BBC504CD1@planet.nl>
Message-ID:

Alexander, the exit(2) system call takes a parameter, which is an integer
status that the process will return. The value 0 is traditionally a
successful return, and anything else signifies an error condition.
This assembler is the moral equivalent of:

char hello[] = { "hello world\n" };
main() {
        /* sizeof(hello) - 1 leaves out the terminating NUL */
        write(1, hello, sizeof(hello) - 1);
        exit(0);
}

From clemc at ccc.com Fri May 1 08:09:50 2020
From: clemc at ccc.com (Clem Cole)
Date: Thu, 30 Apr 2020 18:09:50 -0400
Subject: [TUHS] as(1) on Ultrix-11 vs 2.11BSD
In-Reply-To:
References: <9A1BF33E-49C9-4712-BF25-4C0BBC504CD1@planet.nl>
Message-ID:

Ouch - just looked at that more carefully. exit is returning whatever was
left on the stack. The push $0 is something in the system calling
convention for that port. You'll have to look at the kernel sources for
that system, in the code that takes the trap.

Clem

From ron at ronnatalie.com Fri May 1 10:12:50 2020
From: ron at ronnatalie.com (Ronald Natalie)
Date: Thu, 30 Apr 2020 20:12:50 -0400
Subject: [TUHS] as(1) on Ultrix-11 vs 2.11BSD
In-Reply-To:
References: <9A1BF33E-49C9-4712-BF25-4C0BBC504CD1@planet.nl>
Message-ID: <820665B9-4D2E-4B52-95BD-F223A4AF9A6A@ronnatalie.com>

The syscall skips over a location for reasons not fully clear to me.
I guess if you dug down into the libc functions that call it you’d figure
out why. As far as the kernel is concerned, it just doesn’t look at it.
The zero is just a spacer; other code just does a tst -(sp) there, which
just decrements the stack pointer.

From pnr at planet.nl Sat May 2 06:48:16 2020
From: pnr at planet.nl (Paul Ruizendaal)
Date: Fri, 1 May 2020 22:48:16 +0200
Subject: [TUHS] SDB debugger
Message-ID: <2F4C604D-F01C-4A82-948A-7E77093B48A1@planet.nl>

Reading some more stuff about the road from 7th Edition to 8th Edition,
this time about debuggers.

My current understanding is as follows:

- On 6th edition the debugger was ‘cdb’

- On 7th edition it was ‘adb’, a rewrite / evolution from ‘cdb’

- In 32V a new debugger appears, ‘sdb’. Its code seems a derivative from
  ‘adb’, but the command language is substantially reworked and it uses a
  modified variant of the a.out linker format - in essence the beginnings
  of ‘stabs’. Of course the compiler, assembler, linker and related tools
  all emit/recognize these new symbol table elements.

- The July 78 file note by London/Reiser does not mention a reworked
  debugger at all; the 32V tape that is on TUHS has ‘sdb’ files that are
  dated Feb/Mar 1979. This stuff must have been developed between July 78
  and March 79.

- In the SysIII and 3BSD code on TUHS (from early 80 and late 79
  respectively) the stabs format is more developed. For SysIII it is ‘VAX
  only’. With these roots, it is not surprising that it is also in 8th
  Edition.

Two questions:

(1) According to Wikipedia the original author of the stabs format is
unknown. It also says that the original author of ‘sdb’ is unknown. Is that
correct, is the author really unknown?

(2) As far as I can tell, the ‘sdb’ debugger was never back ported to 16
bit Unix, not in the SysIII line and not in the 2.xBSD line. It would seem
to me that the simple stabs format of 32V would have lent itself to being
back ported. Is it correct that no PDP11 Unix used (a simple) stabs tool
chain and debugger?

From clemc at ccc.com Sat May 2 07:57:39 2020
From: clemc at ccc.com (Clem Cole)
Date: Fri, 1 May 2020 17:57:39 -0400
Subject: [TUHS] SDB debugger
In-Reply-To: <2F4C604D-F01C-4A82-948A-7E77093B48A1@planet.nl>
References: <2F4C604D-F01C-4A82-948A-7E77093B48A1@planet.nl>
Message-ID:

On Fri, May 1, 2020 at 4:49 PM Paul Ruizendaal wrote:

> Reading some more stuff about the road from 7th Edition to 8th Edition,
> this time about debuggers.
>
> My current understanding is as follows:
>
> - On 6th edition the debugger was ‘cdb’
>
One of the early USENIX tapes has a copy of ddt, which used the DEC DDT
syntax from the PDP-10 and PDP-11s. I don't remember who created it. Might
have been Harvard or Cooper-Union. I'm not sure I ever bothered to learn
cdb, as we had ddt at CMU. What I don't remember is if it ran on Fifth
Edition.

> - On 7th edition it was ‘adb’, a rewrite / evolution from ‘cdb’
>
Mumble -- IIRC adb was an attempt at being common between the PDP-11 and
the Interdata. Heavy use of the pre-processor. Typedefs did not yet exist.
Steve Johnson should chime in here. I thought it was a new code base.
Again, IIRC ddt did not just recompile on V7 and I needed something fast,
as I was trying to write what would become a 68000 backend for the C
compiler, so I just learned adb and never looked back until dbx.

>
> - In 32V a new debugger appears, ‘sdb’.

I thought adb was still in 32V also; adb was definitely in the BSD 4.x
codebase.

> Its code seems a derivative from ‘adb’, but the command language is
> substantially reworked and it uses a modified variant of the a.out linker
> format - in essence the beginnings of ‘stabs’. Of course the compiler,
> assembler, linker and related tools all emit/recognize these new symbol
> table elements.
>
> - The July 78 file note by London/Reiser does not mention a reworked
> debugger at all; the 32V tape that is on TUHS has ‘sdb’ files that are
> dated Feb/Mar 1979. This stuff must have been developed between July 78 and
> March 79.
>
> - In the SysIII and 3BSD code on TUHS (from early 80 and late 79
> respectively) the stabs format is more developed. For SysIII it is ‘VAX
> only’. With these roots, it is not surprising that it is also in 8th
> Edition.
>
Don't forget Mark Linton's thesis, dbx (which today has become gdb). I
thought that was part of the original 4.1 (FastVax) tape, as part of the
new compilers from Susan Graham's students. It certainly was part of
4.1c/4.2, as he had left for Stanford by then.

[Note to Warren, we should put 4.1 in the browsing tree. The kernel is
different enough from 4.0 and does have new utilities, although it was not
nearly as different as 4.1c. The reality is until 4.2BSD came out with the
networking support, most Vaxen running BSD were 4.1 not 4.0 based.]

FWIW: Does anyone know if dbx ended up 8 or 9th - Norman/Rob?

I also thought it was someone in Graham's team that had added support for
long identifiers. Mary Ann, do you remember/can you think of who that might
have been? But after that work was completed, the updated UCB compilers
went back to MH at some point; what changes were folded in I do not know.
And of course, Steve started working on his new generation of compilers at
USG (PCC2), which would land in the System V stream.

>
>
> Two questions:
>
> (1) According to Wikipedia the original author of the stabs format is
> unknown. It also says that the original author of ‘sdb’ is unknown. Is that
> correct, is the author really unknown?
>
I don't remember, but it is possible this was UCB work that went back to
Bell. BTW: wnj started out as a Graham student. His fingers were on a
number of things (I think such as the Pascal subsystem in 1 and 2 BSD). It
is possible he did something like the long identifiers; I just don't
remember when they went into the system or who did it. I thought that was
with the Vax, as the compilers were pressed for code (address) space on the
16-bit systems.

>
> (2) As far as I can tell, the ‘sdb’ debugger was never back ported to 16
> bit Unix, not in the SysIII line and not in the 2.xBSD line. It would seem
> to me that the simple stabs format of 32V would have lent itself to being
> back ported. Is it correct that no PDP11 Unix used (a simple) stabs tool
> chain and debugger?
>
I don't know who would have done the work, other than someone like Bostic
(and he would have only done that if he needed it). When did the 2.xBSD
line pick up the long identifiers? Keith must have had something, as the
Vax code started to assume them pretty soon after the feature was there.

FWIW: until Linton's source debugger, the debugger I remember that most of
us used at UCB had been adb, not sdb.

The truth is that once the Vaxen showed up at UCB, most of the
grad students like Mary Ann or myself all had accounts on those systems,
and things like the Cory Hall 70 were mostly undergrad machines being used
for teaching. So most of the new work was being done on the Vax. Bostic
got his start as an undergrad moving things back from the Vax to the
machine(s) in Cory and Math/Statistics (which were 11's). But the machines
being used for new features were definitely the Vax, so all the compiler
work by Graham's languages students (and Linton's debugger) was Vax based.

From reed at reedmedia.net Sat May 2 09:05:24 2020
From: reed at reedmedia.net (Jeremy C. Reed)
Date: Fri, 1 May 2020 18:05:24 -0500 (CDT)
Subject: [TUHS] SDB debugger
In-Reply-To: <2F4C604D-F01C-4A82-948A-7E77093B48A1@planet.nl>
References: <2F4C604D-F01C-4A82-948A-7E77093B48A1@planet.nl>
Message-ID:

> unknown. It also says that the original author of ‘sdb’ is unknown. Is
> that correct, is the author really unknown?

Howard Katseff

See 3BSD: usr/doc/sdb/sdbrp.n
and
https://www.usenix.org/legacy/publications/library/proceedings/cinci93/full_papers/katseff.txt

I have also interviewed him. My book (WIP) says:

Katseff, who had already graduated from Berkeley in August 1978, was
working at Bell Labs in Holmdel in the group developing 32/V. He
started working on sdb, a symbolic debugger for C language programs.
It was used for debugging core images from aborted programs. It could
report which original source code line caused the error, allow access
to variables, define breakpoints, call procedures, and single-step on
a line by line basis. (Haley and Joy had provided constructive
criticism during the sdb development.)
% cite: archives/1970s/3bsd/usr/doc/sdb/sdbrp.n
% NOTE: Katseff didn't remember if that is for doc or code

This debugger was transferred to Berkeley along with the rest of the
32/V operating system.\cite{katseff1}

From noel.hunt at gmail.com Sat May 2 10:49:19 2020
From: noel.hunt at gmail.com (Noel Hunt)
Date: Sat, 2 May 2020 10:49:19 +1000
Subject: [TUHS] SDB debugger
In-Reply-To: <2F4C604D-F01C-4A82-948A-7E77093B48A1@planet.nl>
References: <2F4C604D-F01C-4A82-948A-7E77093B48A1@planet.nl>
Message-ID:

When it comes to Eighth Edition, please don't forget Tom Cargill's
'pi'. There was also a version, I believe, that was used as the
debugger for programs on the Blit/Jerq; it seems to be known as
'4pi' in the source.

From robpike at gmail.com Sat May 2 11:22:25 2020
From: robpike at gmail.com (Rob Pike)
Date: Sat, 2 May 2020 11:22:25 +1000
Subject: [TUHS] SDB debugger
In-Reply-To:
References: <2F4C604D-F01C-4A82-948A-7E77093B48A1@planet.nl>
Message-ID:

I don't remember dbx appearing in our lab, but that doesn't mean it wasn't
there.

I did quite a bit of work on adb, renamed db, mostly finishing things up
and fixing a lot of bugs, to make it actually work in Plan 9. I had several
conversations with Steve Bourne about it to understand why it seemed
broken, and how to fix it.
Once fixed, it could do some remarkable stuff
but nobody but me seemed to care because it was lower level than
cdb/sdb/gdb. I liked it because, once those bugs were fixed, it got the
right answer, something gdb never did back then. The [scg]db of yesteryear
was far too unreliable and crashy for me. After it dumped core for the nth
time on top of the core I was debugging, I gave up on it. But I was never a
debugger-first programmer. None of us in the lab were, and that's probably
why the debugging setup in Unix is to this day so weak compared to what
other systems provide.

The sdb/gdb line also had a peculiar property of not answering the
question you were asking, although I don't remember the details. It was
more interested in the symbols than the code, and that could get in the
way. The failure of the compiler to give good symbols didn't help. And now
we have DWARF, for which my only comment is: oof, the sound one makes
catching a dropped bag of concrete mix.

One debugger that we used a lot, although more as a scripting language for
things like tracing system calls and checking for malloc leaks than as an
interactive tool, was Phil Winterbottom's Acid. It has a crazy language but
once you licked it (I think the only three who did were Phil, me, and Russ
Cox) it was very powerful. Acme had some front-end code for it that made it
great for displaying multithreaded program stacks.

Pi was cool, but that was earlier and tied to the Jerq/Blit and C++.

-rob

From doug at cs.dartmouth.edu Sat May 2 12:52:49 2020
From: doug at cs.dartmouth.edu (Doug McIlroy)
Date: Fri, 01 May 2020 22:52:49 -0400
Subject: [TUHS] SDB debugger
Message-ID: <202005020252.0422qnFL066007@tahoe.cs.Dartmouth.EDU>

> Does anyone know if dbx ended up 8 or 9th

I believe the only debuggers on research machines were
        db      v1-v6
        adb     v7,v9,v10
        cdb     v3-v6
        sdb     v8-v9
        pi      v8-v10

Doug

From noel.hunt at gmail.com Sat May 2 13:49:24 2020
From: noel.hunt at gmail.com (Noel Hunt)
Date: Sat, 2 May 2020 13:49:24 +1000
Subject: [TUHS] SDB debugger
In-Reply-To:
References: <2F4C604D-F01C-4A82-948A-7E77093B48A1@planet.nl>
Message-ID:

> Pi was cool, but that was earlier and tied to the Jerq/Blit and C++.

I believe Dave Kapilow kept up development of pi at Bell, long
after Tom Cargill had left research. There is a CDROM of pi as of
about 2002, with versions compilable for various architectures,
but the terminal part, which was actually 'pads', originally Blit
graphics, was replaced with Openlook. It isn't hard to get pads
working with the Plan 9 graphics model.

The problem with that distribution is that it is still all
stabs-based, and although dwarf is a nightmare, being able to
read dwarf symbol tables in pi would be a pleasing step forward.

Dealing with dwarf is a struggle, but Russ Cox and Rob Pike have
written perhaps the only sane code in the world to deal with the
complexities in a nice way. One is advised to avoid 'libdwarf'.

From pnr at planet.nl Sat May 2 19:10:48 2020
From: pnr at planet.nl (Paul Ruizendaal)
Date: Sat, 2 May 2020 11:10:48 +0200
Subject: [TUHS] SDB debugger
In-Reply-To:
References: <2F4C604D-F01C-4A82-948A-7E77093B48A1@planet.nl>
Message-ID: <8CBDD12A-563F-458D-938B-3AF84D936F39@planet.nl>

> On May 1, 2020, at 11:57 PM, Clem Cole wrote:
> [Note to Warren, we should put 4.1 in the browsing tree. The kernel is
> different enough from 4.0 and does have new utilities, although it was not
> nearly as different as 4.1c. The reality is until 4.2BSD came out with the
> networking support, most Vaxen running BSD were 4.1 not 4.0 based.

It is already there! The entry named "BBN's TCP/IP Code for the VAX” is
4.1BSD with the BBN TCP stack added. The changes to the kernel are just one
or two dozen lines (all in sys2.c and main.c) bracketed in #ifdef BBNNET
blocks. The actual network stack is in a separate build directory.

Maybe the entry should be renamed “4.1BSD with TCP/IP” to make this more
clear.

From clemc at ccc.com Sun May 3 02:04:09 2020
From: clemc at ccc.com (Clem Cole)
Date: Sat, 2 May 2020 12:04:09 -0400
Subject: [TUHS] SDB debugger
In-Reply-To: <8CBDD12A-563F-458D-938B-3AF84D936F39@planet.nl>
References: <2F4C604D-F01C-4A82-948A-7E77093B48A1@planet.nl> <8CBDD12A-563F-458D-938B-3AF84D936F39@planet.nl>
Message-ID:

Paul - you are correct that it's got a lot of it there, and in particular
the kernel (which is good). But the /usr directory (and thus /usr/src) is
missing. It's not a complete distribution; it's the BBN 4.1 TCP
distribution, which was a subset. As I said, 4.0 and 4.1 were similar, but
different. The trick is to find a complete 4.1 distribution tape.

From lm at mcvoy.com Sun May 3 03:45:18 2020
From: lm at mcvoy.com (Larry McVoy)
Date: Sat, 2 May 2020 10:45:18 -0700
Subject: [TUHS] SDB debugger
In-Reply-To: <202005020252.0422qnFL066007@tahoe.cs.Dartmouth.EDU>
References: <202005020252.0422qnFL066007@tahoe.cs.Dartmouth.EDU>
Message-ID: <20200502174518.GC30768@mcvoy.com>

So I definitely remember adb, I liked it for the same reasons that Rob
did, it told you the truth.

I also remember using an sdb debugger, not sure if it was at Sun or when I
was doing a Sys V port. I liked it, it was reasonable.

I think dbx ended up becoming the one I used on BSD.

Gdb eventually got good enough but I'm with Rob, it was a mess early on.

But truth be known, I'm sort of a printf() debugger. The main thing I use
gdb for is a stack trace, that's usually enough.
The BitKeeper source has this "gem":

    void
    gdb_backtrace(void)
    {
        FILE *f;
        char *cmd;

        unless (getenv("_BK_BACKTRACE")) return;
        unless ((f = efopen("BK_TTYPRINTF")) || (f = fopen(DEV_TTY, "w"))) {
            f = stderr;
        }
        cmd = aprintf("gdb -batch -ex backtrace '%s/bk' %u 1>&%d 2>&%d",
            bin, getpid(), fileno(f), fileno(f));
        system(cmd);
        free(cmd);
        if (f != stderr) fclose(f);
    }

On Fri, May 01, 2020 at 10:52:49PM -0400, Doug McIlroy wrote:
> > Does anyone know if dbx ended up 8 or 9th
>
> I believe the only debuggers on research machines were
> db v1-v6
> adb v7,v9,v10
> cdb v3-v6
> sdb v8-v9
> pi v8-v10
>
> Doug

-- --- Larry McVoy lm at mcvoy.com http://www.mcvoy.com/lm

From pnr at planet.nl Sun May 3 06:16:06 2020 From: pnr at planet.nl (Paul Ruizendaal) Date: Sat, 2 May 2020 22:16:06 +0200 Subject: [TUHS] SDB debugger In-Reply-To: References: <2F4C604D-F01C-4A82-948A-7E77093B48A1@planet.nl> Message-ID: <21F16C75-62AB-422A-A43F-981407E11434@planet.nl>

Thanks, all, for that input.

The ddt debugger appears to be a fork of 5th edition cdb. It survived in the Interdata 7/32 port: https://www.tuhs.org/cgi-bin/utree.pl?file=Interdata732/usr/source/chicago It appears to have originated from Bill Allen at the Naval Postgraduate School.

Some more reading appears to show a much more gradual development than I first thought. Working along Doug’s list:

- First there was db, which claims to be loosely based on DEC’s ODT in its man page. Written in assembler.
- Then there is cdb, a rewrite in C, as from 3rd edition. Judging from the man pages, in 3rd and 4th edition it is a mostly incomplete project.
- The first reasonably complete version of cdb appears in 5th edition. It only handles “normal” symbols (see below for “normal”). The ddt debugger forks from this version, presumably to fill in some missing features (e.g. access to non-C identifiers, single stepping, etc.; I have not done a full feature comparison).
- In 6th edition “pseudo” symbols are introduced: symbols starting with a tilde that provide names for auto and register variables. The cdb debugger is updated to allow references to local variables using a “procname:varname” syntax. The ddt debugger picked this up as well. It is a first step towards the stabs format. I would assume that db and cdb are the work of dmr/ken. - In 7th edition there is the new adb, by Steve Bourne. Main focus of adb appears to have been portability more than major new features. Again, I have not compared the feature sets of ddt and adb to see if there was an influence. - In 32V the symbol format is changed: (i) the “tilde hack” is replaced by a new assembler pseudo op “.stabs”; (ii) this is then used to include more pseudo symbols, for line numbers, for file names, etc.; (iii) the symbol struct is extended with a field to hold the symbol's type. It is essentially stabs, but with 8 char names. A new source level debugger, sdb, allows source level debugging. Its command language is the first to feel like a gdb ancestor. Author Howard Katseff. The dbx debugger appears to stand on the shoulders of sdb, and gdb on the shoulders of dbx. In 8th edition there are 3 debuggers: adb, sdb and pi (for use with the Blit). > On 2 May 2020, at 02:49, Noel Hunt wrote: > > When it comes to Eight Edition, please don't forget Tom Cargill's > 'pi'. There was also a version I believe that was used as the > debugger for programs on the Blit/Jerq; it seems to be known as > '4pi' in the source. > > > On Sat, May 2, 2020 at 6:49 AM Paul Ruizendaal wrote: > Reading some more stuff about the road from 7th Edition to 8th Edition, this time about debuggers. > > My current understanding is as follows: > > - On 6th edition the debugger was ‘cdb’ > > - On 7th edition it was ‘adb’, a rewrite / evolution from ‘cdb’ > > - In 32V a new debugger appears, ‘sdb’. 
Its code seems a derivative from ‘adb’, but the command language is substantially reworked and it uses a modified variant of the a.out linker format - in essence the beginnings of ‘stabs’. Of course the compiler, assembler, linker and related tools all emit/recognize these new symbol table elements. > > - The July 78 file note by London/Reiser does not mention a reworked debugger at all; the 32V tape that is on TUHS has ’sdb' files that are dated Feb/Mar 1979. This stuff must have been developed between July 78 and March 79. > > - In the SysIII and 3BSD code on TUHS (from early 80 and late 79 respectively) the stabs format is more developed. For SysIII it is ‘VAX only’. With these roots, it is not surprising that it is also in 8th Edition. > > > Two questions: > > (1) According to Wikipedia the original author of the stabs format is unknown. It also says that the original author of ‘sdb’ is unknown. Is that correct, is the author really unknown? > > (2) As far as I can tell, the ’sdb’ debugger was never back ported to 16 bit Unix, not in the SysIII line and not in the 2.xBSD line. It would seem to me that the simple stabs format of 32V would have lent itself to being back ported. Is it correct that no PDP11 Unix used (a simple) stabs tool chain and debugger? > > > From wkt at tuhs.org Sun May 3 10:58:32 2020 From: wkt at tuhs.org (Warren Toomey) Date: Sun, 3 May 2020 10:58:32 +1000 Subject: [TUHS] Test of new mailman software Message-ID: <20200503005832.GA11220@minnie.tuhs.org> All, I upgraded the mailman software on minnie, just want to test that it's working OK. Cheers, Warren From wkt at tuhs.org Sun May 3 11:41:23 2020 From: wkt at tuhs.org (Warren Toomey) Date: Sun, 3 May 2020 11:41:23 +1000 Subject: [TUHS] Fwd: Gnot terminal schematics Message-ID: <20200503014123.GA17619@minnie.tuhs.org> Brantley Coile just asked: Looks like the mailman software works! Say, do you know if there are any copies of the Gnot terminal schematics? 
Brantley I don't know, does anybody else know? Cheers, Warren From gregg.drwho8 at gmail.com Sun May 3 11:42:48 2020 From: gregg.drwho8 at gmail.com (Gregg Levine) Date: Sat, 2 May 2020 21:42:48 -0400 Subject: [TUHS] Test of new mailman software In-Reply-To: <20200503005832.GA11220@minnie.tuhs.org> References: <20200503005832.GA11220@minnie.tuhs.org> Message-ID: Hello! All is good here, Warren. ----- Gregg C Levine gregg.drwho8 at gmail.com "This signature fought the Time Wars, time and again." On Sat, May 2, 2020 at 8:59 PM Warren Toomey wrote: > > All, I upgraded the mailman software on minnie, just want to test that it's > working OK. > Cheers, Warren From usotsuki at buric.co Sun May 3 11:51:24 2020 From: usotsuki at buric.co (Steve Nickolas) Date: Sat, 2 May 2020 21:51:24 -0400 (EDT) Subject: [TUHS] Test of new mailman software In-Reply-To: <20200503005832.GA11220@minnie.tuhs.org> References: <20200503005832.GA11220@minnie.tuhs.org> Message-ID: On Sun, 3 May 2020, Warren Toomey wrote: > All, I upgraded the mailman software on minnie, just want to test that it's > working OK. > Cheers, Warren > So far so good. -uso. From wkt at tuhs.org Sun May 3 11:52:22 2020 From: wkt at tuhs.org (Warren Toomey) Date: Sun, 3 May 2020 11:52:22 +1000 Subject: [TUHS] Test of new mailman software In-Reply-To: <20200503005832.GA11220@minnie.tuhs.org> References: <20200503005832.GA11220@minnie.tuhs.org> Message-ID: <20200503015222.GA22667@minnie.tuhs.org> On Sun, May 03, 2020 at 10:58:32AM +1000, Warren Toomey wrote: > All, I upgraded the mailman software on minnie, just want to test that it's > working OK. Several people have sent me ACKs, so it looks OK. Thanks all. 
Warren From norman at oclsc.org Sun May 3 12:21:04 2020 From: norman at oclsc.org (Norman Wilson) Date: Sat, 2 May 2020 22:21:04 -0400 (EDT) Subject: [TUHS] SDB debugger Message-ID: <20200503022104.181B54422F@lignose.oclsc.org> Doug's list is slightly off: adb v7-v10 sdb v8-v10 sdb may actually have been in V7; I'm quite sure it was present in 32/V. But it's not in the V7 manual. adb and sdb were certainly working fine when I arrived in 1127, but they still used ptrace because nobody wanted to touch the code. I used adb quite often (still would were it available in modern worlds!), so I cared enough to take it over, restructuring it quite a bit to make it easier to retarget for different instruction sets and byte orders, and of course to use /proc. I also made some trivial, compatible changes to how numbers were read and printed to conform to Rob's Rule (of which I am also a fan) that what a program presents as output it should also accept as input. sdb I wasn't as fond of, but I did want to get rid of ptrace, so I tinkered it just enough to accomplish that. I do remember clearly celebrating the death of ptrace by removing ptrace(2) from the copy of the V8 manual in the UNIX Room. It took up two pages, and they happened to be facing pages, so I glued them together. I wish it was as easy for others to have such satisfaction these days. Norman Wilson Toronto ON From lm at mcvoy.com Sun May 3 12:41:16 2020 From: lm at mcvoy.com (Larry McVoy) Date: Sat, 2 May 2020 19:41:16 -0700 Subject: [TUHS] SDB debugger In-Reply-To: <20200503022104.181B54422F@lignose.oclsc.org> References: <20200503022104.181B54422F@lignose.oclsc.org> Message-ID: <20200503024116.GE9534@mcvoy.com> On Sat, May 02, 2020 at 10:21:04PM -0400, Norman Wilson wrote: > I wish it was as easy for others to have such > satisfaction these days. Amen to that. I think we all lived, you Bell Labs people especially, in a simpler time. 
We were trying to fit into 64K, split I/D 128K, my Z80 was 64K but some extra for graphics, then the VAX came and we were trying for 1MB, Suns with 4MB. So small mattered a lot and that meant the Unix philosophy of do one thing and do it well worked quite nicely. What that also meant, to people coming on a little bit after, was that it was relatively easy to modify stuff, the stuff was not that complex. Even I had an easy time, my prime was back at Sun when SunOS was a uniprocessor OS. That is dramatically simpler than a fully threaded SMP OS that has support for TCP offloading, NUMA, etc, etc. I really don't know how systems people do it these days, it is a much more complex world. So I'm with Norm, it was fun back in the day to be able to come in and have a big impact. I too wish that it was as easy for young people to come in and have that impact. I've done that and it was awesome, these days, I have no idea how I'd make a difference. --lm From robpike at gmail.com Sun May 3 13:05:03 2020 From: robpike at gmail.com (Rob Pike) Date: Sun, 3 May 2020 13:05:03 +1000 Subject: [TUHS] SDB debugger In-Reply-To: <20200503022104.181B54422F@lignose.oclsc.org> References: <20200503022104.181B54422F@lignose.oclsc.org> Message-ID: I am happy to learn that there is at least one other person who likes that rule. It's never really caught on, which mystifies me. There is some progress, but people are too enamored by animations using cursor addressing to appreciate the elegance of editable typescripts. -rob -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From arnold at skeeve.com Sun May 3 16:58:52 2020 From: arnold at skeeve.com (arnold at skeeve.com) Date: Sun, 03 May 2020 00:58:52 -0600 Subject: [TUHS] SDB debugger In-Reply-To: <21F16C75-62AB-422A-A43F-981407E11434@planet.nl> References: <2F4C604D-F01C-4A82-948A-7E77093B48A1@planet.nl> <21F16C75-62AB-422A-A43F-981407E11434@planet.nl> Message-ID: <202005030658.0436wqRf000460@freefriends.org> Paul Ruizendaal wrote: > The dbx debugger appears to stand on the shoulders of sdb, > and gdb on the shoulders of dbx. I think this is a fair statement with respect to command languages and how they work, but probably not w.r.t. shared code bases (a guess, but particularly in the case of gdb I bet that would have been written from scratch.) This only makes sense, too. The user base's first source level debugger would have been sdb. Modelling dbx after sdb (to whatever extent) makes it easier for users to pick it up. Similarly, I remember moving from dbx to gdb; it was straightforward, but also a big improvement in terms of the command line. My two cents :-) Arnold From arnold at skeeve.com Sun May 3 17:14:02 2020 From: arnold at skeeve.com (arnold at skeeve.com) Date: Sun, 03 May 2020 01:14:02 -0600 Subject: [TUHS] SDB debugger In-Reply-To: <20200503024116.GE9534@mcvoy.com> References: <20200503022104.181B54422F@lignose.oclsc.org> <20200503024116.GE9534@mcvoy.com> Message-ID: <202005030714.0437E2xv002086@freefriends.org> Larry McVoy wrote: > I really don't know how systems people do it these days, it is a > much more complex world. > > ... these days, I have no idea how I'd make a difference. By doing something besides systems. There are tons of open source projects at the user level that make a difference. Consider something like AsciiDoc. Less than 10 years old (methinks), and in use for production by at least one major technical publisher that I know of. (And it sure beats the pants off of DocBook XML.) Or the work I do on gawk, Chet on Bash, other GNU bits. 
There's a whole world out there besides just the kernel. Arnold From clemc at ccc.com Mon May 4 02:13:09 2020 From: clemc at ccc.com (Clem Cole) Date: Sun, 3 May 2020 12:13:09 -0400 Subject: [TUHS] SDB debugger In-Reply-To: <21F16C75-62AB-422A-A43F-981407E11434@planet.nl> References: <2F4C604D-F01C-4A82-948A-7E77093B48A1@planet.nl> <21F16C75-62AB-422A-A43F-981407E11434@planet.nl> Message-ID: On Sat, May 2, 2020 at 4:16 PM Paul Ruizendaal wrote: > The dbx debugger appears to stand on the shoulders of sdb, and gdb on the > shoulders of dbx. > Mumble ... It's true rms started with dbx and peed on it in their usual way - similar to the Gosling EMACS to GnuEMACS story. But Mark wrote DBX from scratch, although I would not be surprised if he looked at how adb and sdb handled the symbol table and could have lifted that code from there. If I remember discussions with him about it, his interface model was really more the VMS debugger than sdb. As I said, I really don't remember anyone at UCB in those days using sdb. At the time, there was a huge push (mostly from the Stanford crew) to make VMS the Arpa standard system replacing the PDP-10, TOPS, Tenex, ITS, et al. Besides the performance argument (hence the 4.1 FASTVAX work), one of the arguments from the pro-VMS side was the language toolchain, including comments that UNIX did not have a debugger in the same vein as VMS (and also that the Fortran system was considered pretty weak). When he was still at UCB, Mark had tried to make sure dbx worked with both C and Fortran (i.e. at the beginning of the project I did some testing for him because I was working on a large array processor in the CAD group, that needed to compile the EE CAD suite which at that point was heavily dominated by Fortran codes).
The whole BSD UNIX Fortran was not great and I know when Masscomp built their debugger, they started with dbx and had to gut the multiple language support (thus rewriting much of the Fortran & Pascal support); but the person that did it had been part of DEC's TLG team previously and had a direct knowledge of how the DEC debugger actually handled multiple languages. BTW, at the time, I personally did not really care, as long as C support worked. Paul W -- do you remember if DEC TLG did a version of dbx for Ultrix (Leslie might remember)? FWIW: I know that DEC had a number of different debugger projects on the UNIX side over the years, and I really don't remember what was done for the VAX, as I was not there at the time. By MIPS/Alpha in the mid-late 90's there was a whole new debugger stream that had been developed as part of GEM, but there was another one that came from MIPS too which was based on dbx. Clem

From rdm at cfcl.com Mon May 4 02:16:58 2020 From: rdm at cfcl.com (Rich Morin) Date: Sun, 3 May 2020 09:16:58 -0700 Subject: [TUHS] SDB debugger In-Reply-To: <20200502174518.GC30768@mcvoy.com> References: <202005020252.0422qnFL066007@tahoe.cs.Dartmouth.EDU> <20200502174518.GC30768@mcvoy.com> Message-ID: <7A8D26C8-67DB-4676-92D1-7842209F1807@cfcl.com>

> On May 2, 2020, at 10:45, Larry McVoy wrote:
>
> ... But truth be known, I'm sort of a printf() debugger. ...

So am I, and ISTR Brian Kernighan and Larry Wall saying that they are, as well. Over the years, I've written some scripts to make this less painful. For example, in Ruby, I use the et() script, as written up in http://wiki.cfcl.com/Projects/Ruby/Phone_Home:

    l = %w[ context~ dir_path# file_path foo.bar bar[:baz] ]
    et(l)

This produces the following output:

    some_script:72:in `main'
    context {
      :foo => "yada yada yada",
      ...
    }
    file_path "a/b/c"
    foo.bar :warn
    bar[:baz] 42

I also have a tiny Elixir script, ii/2, which works nicely in pipelines:

    ... |> ii(:foo) ...

-r

From henry.r.bent at gmail.com Mon May 4 02:53:41 2020 From: henry.r.bent at gmail.com (Henry Bent) Date: Sun, 3 May 2020 12:53:41 -0400 Subject: [TUHS] SDB debugger In-Reply-To: References: <2F4C604D-F01C-4A82-948A-7E77093B48A1@planet.nl> <21F16C75-62AB-422A-A43F-981407E11434@planet.nl> Message-ID: On Sun, 3 May 2020 at 12:14, Clem Cole wrote: > > Paul W -- do you remember if DEC TLG did a version of dbx for Ultrix > (Leslie might remember)? FWIW: I know that DEC had a number of different > debugger projects on the UNIX side over the years, and I really don't > remember what was done for the VAX, as I was not there at the time. By > MIPS/Alpha in the mid-late 90's there was a whole new debugger stream that > had been developed as part of GEM, but there was another one that came from > MIPS too which was based on dbx. > Perhaps unsurprisingly, as Ultrix 1 was basically just 4.2BSD with some tweaks/addons, dbx has been there since the beginning as /usr/ucb/dbx. The binary in 1.1 has SCCS strings mostly dating it to '83 and the 2.0 source tree has dates that are mostly December '84. -Henry

From henry.r.bent at gmail.com Mon May 4 03:06:54 2020 From: henry.r.bent at gmail.com (Henry Bent) Date: Sun, 3 May 2020 13:06:54 -0400 Subject: [TUHS] SDB debugger In-Reply-To: References: <2F4C604D-F01C-4A82-948A-7E77093B48A1@planet.nl> <21F16C75-62AB-422A-A43F-981407E11434@planet.nl> Message-ID: On Sun, 3 May 2020 at 12:53, Henry Bent wrote: > On Sun, 3 May 2020 at 12:14, Clem Cole wrote: > >> >> Paul W -- do you remember if DEC TLG did a version of dbx for Ultrix >> (Leslie might remember)?
FWIW: I know that DEC had a number of different >> debugger projects on the UNIX side over the years, and I really don't >> remember what was done for the VAX, as I was not there at the time. By >> MIPS/Alpha in the mid-late 90's there was a whole new debugger stream that >> had been developed as part of GEM, but there was another one that came from >> MIPS too which was based on dbx. >> > > Perhaps unsurprisingly, as Ultrix 1 was basically just 4.2BSD with some > tweaks/addons, dbx has been there since the beginning as /usr/ucb/dbx. The > binary in 1.1 has SCCS strings mostly dating it to '83 and the 2.0 source > tree has dates that are mostly December '84. > Ultrix-32m 1.0 also has dbx, with dates no later than August '83. These dates mostly correspond to the 4.2 source tree on TUHS ( https://minnie.tuhs.org/cgi-bin/utree.pl?file=4.2BSD/usr/src/ucb/dbx ) but not exactly - the 4.2 tree has a newer object.c, implying that DEC was getting early copies of 4.2. The later fixes in Ultrix 1.1 also imply that DEC was getting regular updates from UCB. -Henry From henry.r.bent at gmail.com Mon May 4 03:13:24 2020 From: henry.r.bent at gmail.com (Henry Bent) Date: Sun, 3 May 2020 13:13:24 -0400 Subject: [TUHS] SDB debugger In-Reply-To: References: <2F4C604D-F01C-4A82-948A-7E77093B48A1@planet.nl> <21F16C75-62AB-422A-A43F-981407E11434@planet.nl> Message-ID: On Sun, 3 May 2020 at 12:14, Clem Cole wrote: > > By MIPS/Alpha in the mid-late 90's there was a whole new debugger stream > that had been developed as part of GEM, but there was another one that came > from MIPS too which was based on dbx. > This raises a question I've always had - what was the relationship between DEC's compilers on MIPS/Alpha and the work the MIPS folks did? Early versions of OSF/1 on both platforms have tools that are very, very similar to the MIPS compiler suite - ugen, uopt, two-pass assembler, etc.
- and I've always been curious what the heritage was there. -Henry From paul.winalski at gmail.com Mon May 4 03:35:19 2020 From: paul.winalski at gmail.com (Paul Winalski) Date: Sun, 3 May 2020 13:35:19 -0400 Subject: [TUHS] SDB debugger In-Reply-To: References: <2F4C604D-F01C-4A82-948A-7E77093B48A1@planet.nl> <21F16C75-62AB-422A-A43F-981407E11434@planet.nl> Message-ID: On 5/3/20, Clem Cole wrote: > > Paul W -- do you remember if DEC TLG did a version of dbx for Ultrix > (Leslie might remember)? FWIW: I know that DEC had a number of different > debugger projects on the UNIX side over the years, and I really don't > remember what was done for the VAX, as I was not there at the time. By > MIPS/Alpha in the mid-late 90's there was a whole new debugger stream that > had been developed as part of GEM, but there was another one that came from > MIPS too which was based on dbx. DEC's Technical Languages & Environments (TLE) group was responsible for compilers for BLISS, Fortran, Pascal, PL/I, and Ada, on the languages side, and the Language-Sensitive Editor (LSE) and debugger on the software tools side. All for VMS on the VAX and RSX-11, RT-11, and RSTS on the PDP-11. When VAX Ultrix came along, TLE ported the VMS Fortran compiler and runtime to run on Ultrix. It was a rush-rush project and it was decided that there wasn't enough time to modify the Fortran compiler to produce a.out object files or assembler files directly. Instead Caroline Davidson and I modified the VMS linker to run on Ultrix and to accept a.out and .a archive files as well as VMS object files, and to produce a.out executables. This linker was called lk. Part of this project was code to translate the debug information in VMS object files to a.out stabs. That work was done by two members of the Ultrix development team (I don't recall their names). So VAX Fortran for Ultrix used the off-the-shelf Ultrix debugger.
DEC's MIPS and Alpha Unix offerings used COFF as the object file and executable format, and the GEM common back end emitted COFF-style stabs for those platforms. I was the project leader for the group that developed and maintained the object file emission code in the GEM compiler back end, but Ron Brender, who did a LOT of research on debugging highly-optimized code and parallel programs, maintained the GEM code that generated debug information. I know that all of Ron's work ended up in the VMS debugger. I don't know how much of it went into the Unix side of things. I wasn't paying too much attention at the time, but I believe TLE also did its own debugger for the Alpha Unix platform. John Bishop can tell you more about that. When Microsoft sold Visual Fortran to DEC, TLE took over the toolchain and interactive development environment and released it as Digital Visual Fortran (DVF). When Compaq sold the Alpha technology to Intel, DVF went with it and became Intel Visual Fortran, although the Intel compiler back end was chosen and GEM was abandoned. An Intel debugger, idb, was offered on Linux, although not on Unix IIRC. It was based on the DEC Alpha debugger technology. This was later replaced by a debugger based on gdb. -Paul W. From clemc at ccc.com Mon May 4 06:26:30 2020 From: clemc at ccc.com (Clem Cole) Date: Sun, 3 May 2020 16:26:30 -0400 Subject: [TUHS] SDB debugger In-Reply-To: References: <2F4C604D-F01C-4A82-948A-7E77093B48A1@planet.nl> <21F16C75-62AB-422A-A43F-981407E11434@planet.nl> Message-ID: On Sun, May 3, 2020 at 1:13 PM Henry Bent wrote: > This raises a question I've always had - what was the relationship between > DEC's compilers on MIPS/Alpha and the work the MIPS folks did? Early > versions of OSF/1 on both platforms have tools that are very, very similar > to the MIPS compiler suite - ugen, uopt, two-pass assembler, etc. - and > I've always been curious what the heritage was there. 
> I can answer that ;-) You need to understand/remember that the first OS that ran on Alpha was a port of the Ultrix/MIPS base. We debugged the hardware with Ultrix (not VMS). The key is that DEC had full rights to all the MIPS tools. So a quick redo of the MIPS/3000 backend was made to emit Alpha since GEM was not really ready yet and Ultrix was way more portable than VMS was. There was a big fight about whether Ultrix/Alpha should ship or not, which as we know never happened as Tru64 was to be the new OS base (particularly since DC had taken Mica to Microsoft which became NT etc.). In practice, one of the big problems was that Tru64 was OSF/1 really in name and command system only. The Tru64 team kept rewriting large kernel subsystems under the rules of "this code is not 64-bit clean", or "it's immature, I need to rewrite it, I can't understand it", "We need better SCSI support," "the TTY driver sucks," *etc*. ... The idea of 'perfection' was very high on people's minds. As I have always said, every one of those choices could be argued as the correct one technically and in the small, but when you integrate against the whole, Tru64 was 3 years late (and DEC had no revenue because other than a little bit of business in system refresh, few people wanted to buy new Vaxen or MIPS boxes -- they went to Sun). So it was actually a bad idea. They should have shipped OSF/1-Alpha as is and then tweaked it to become Tru64 over time. Or they could have shipped the early OSF/1 for Alpha and MIPS together as a stepping stone - the latter did actually ship under a special license to a few research sites but was never productive. I don't think the former ever left the building. Anyway back to compilers, Tru64 had a 'good enough' compiler based on the MIPS code base to get us all going, but GEM's primary target was VMS since one of the important features of GEM was the VAX->Alpha transpiler technology. VMS was still heavily written in VAX Assembler at the time.
Plus, it actually was a little hairy because GEM had a new C/C++ front-end. So TLE's high order bit was VMS for the Alphas. GEM for Tru64 was about 18 months later. This was also a mixed blessing -- one thing was GEM caught a huge number of 64-bit-isms (God Bless Judy Ward's error detection code). Most large ISVs were having big issues with 32-bit dirty code and in particular the ILP32 assumption (all 64-bit UNIXes use the LP64 model). At that point, there were absolutely no tools in the market to help people move to the new 64-bit world. So until the GEM compiler showed up, ISVs were pretty slow in getting their code cleaned. The funny part of all that work is that DEC basically paid the big ISVs (read Oracle et al) to make their code work on later generations of MIPS, SUN, and INTEL*64. I know of a number of ISVs that discovered after the Tru64/Alpha port, their bug rate dropped and a whole ton of bugs in the basic codebase had been eliminated. As an aside, to this day, if I am given an old piece of C or C++ that I want to run on a modern system (like I was a couple of weeks ago to read some old tapes), I fire up my Alpha and feed the sources to Judy's front-end and listen very carefully to her warnings -- if the GEM Tru64 compiler can accept it without warnings, I have never had a case where the code did not 'just work' when I recompiled for my Mac when I brought it back.
From pnr at planet.nl Mon May 4 07:27:49 2020 From: pnr at planet.nl (Paul Ruizendaal) Date: Sun, 3 May 2020 23:27:49 +0200 Subject: [TUHS] SDB debugger In-Reply-To: References: <2F4C604D-F01C-4A82-948A-7E77093B48A1@planet.nl> <21F16C75-62AB-422A-A43F-981407E11434@planet.nl> Message-ID: <9ABCFC34-08D2-4C09-8215-F806F9E09835@planet.nl> > On 3 May 2020, at 18:13, Clem Cole wrote: > > On Sat, May 2, 2020 at 4:16 PM Paul Ruizendaal wrote: > The dbx debugger appears to stand on the shoulders of sdb, and gdb on the shoulders of dbx. > Mumble ... It's true rms started with dbx and peed on it in their usual way - similar to the Gosling EMACS to GnuEMACS story. > But Mark wrote DBX from scratch, although I would not be surprised if he looked at how adb and sdb handled the symbol table and could have lifted that code from there. If I remember discussions with him about it, his interface model was really more the VMS debugger than sdb. As I said, I really don't remember anyone at UCB in those days using sdb. I meant that in the sense of Feynman standing on the shoulders of Einstein, in turn standing on the shoulders of Newton. Not in the sense of swiping stuff. Things seem to have evolved quite substantially in the 1979-1982 time frame. With 1979 SDB, a.out and its symbols evolve and gain (C based) type information. In 4BSD, late 1980, it evolves some more and gains long symbol names (i.e. >8 chars). SDB tracks this, but stays basically the same. It would seem that one of the innovations in the first versions of DBX (surviving in 4.1c, end of 1982) was to make use of these long names to store much more detailed type information than the 16-bit field used for this in SDB. SDB seems to have had a short life: in the V8 source on TUHS is a readme saying that it has been deprecated: "sdb is deprecated these days. what's here works, but needs a lot of cleanup. c works reasonably well. f77 works barely, especially in areas near equivalence and common.
(f77 needs cleaning up just as badly.)” All that seems to match your recollections. From gdiaz at qswarm.com Mon May 4 22:39:47 2020 From: gdiaz at qswarm.com (gdiaz at qswarm.com) Date: Mon, 04 May 2020 12:39:47 +0000 Subject: [TUHS] sml/nj and unix/plan9 Message-ID: <3740275.SCiKnK0d8l@slayer.slackware.es> hello Was sml/nj part of UNIX at some point? was it considered as a language to use (proof tools may be)? I was wondering if there is any history in common between the two. I've been unable to find anything :-?, please share your stories! :-D Is it true that the language was too slow to be generally useful? There seems to be commentaries along these lines on the internet. thanks! gabi -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 839 bytes Desc: OpenPGP digital signature URL: From treese at acm.org Tue May 5 10:22:44 2020 From: treese at acm.org (Win Treese) Date: Mon, 4 May 2020 20:22:44 -0400 Subject: [TUHS] DEC Compilers (was: Re: SDB debugger In-Reply-To: References: <2F4C604D-F01C-4A82-948A-7E77093B48A1@planet.nl> <21F16C75-62AB-422A-A43F-981407E11434@planet.nl> Message-ID: <8D548BBE-AB7A-457E-87F8-F3718A9AC4B7@acm.org> > On May 3, 2020, at 4:26 PM, Clem Cole wrote: > > Anyway back to compilers, Tru64 had a 'good enough' compiler based on the MIPS code base to get us all going, but GEM's primary target was VMS since one of the important features of GEM was the VAX->Alpha transpiler technology. VMS was still heavily written in VAX Assembler at the time. Plus, It actually was a little hairy because GEM had a new C/C++ front-end. So TLE's high order bit was VMS for the Alphas. GEM for Tru64 was about 18 months later. In the early days of Alpha, I was at DEC’s Cambridge Research Laboratory (directed then by Vic Vyssotsky, having retired from Bell Labs). 
The lab had various connections to Alpha projects, and we learned that there were (I think) 7 different C compilers running on the early port of Ultrix. That number, I think, did not include the port of gcc that DEC was funding outside the company. Andy Payne, a recent hire at the lab, had been an intern in DEC’s semiconductor group, where he had worked on randomized testing for hardware verification. With all the compilers available, he decided to hack up a program to generate random small C programs with computable expected outputs. His program then compiled the random code with each compiler and tested the result. After finding a number of bugs this way, he got tired of submitting the bug reports, and changed his program to write and submit the bug reports automatically. This caused a little bit of consternation with some of the compiler teams at first. Eventually, this led to some collaboration with the DEC languages and tools team, and Bill McKeeman published a paper on that line of work in the Digital Technical Journal in 1998[1]. - Win [1] https://www.hpl.hp.com/hpjournal/dtj/vol10num1/vol10num1art9.pdf From paul.winalski at gmail.com Wed May 6 03:36:40 2020 From: paul.winalski at gmail.com (Paul Winalski) Date: Tue, 5 May 2020 13:36:40 -0400 Subject: [TUHS] DEC Compilers (was: Re: SDB debugger In-Reply-To: <8D548BBE-AB7A-457E-87F8-F3718A9AC4B7@acm.org> References: <2F4C604D-F01C-4A82-948A-7E77093B48A1@planet.nl> <21F16C75-62AB-422A-A43F-981407E11434@planet.nl> <8D548BBE-AB7A-457E-87F8-F3718A9AC4B7@acm.org> Message-ID: On 5/4/20, Win Treese wrote: > > Andy Payne, a recent hire at the lab, had been an intern in DEC’s > semiconductor group, where he had worked on randomized testing for hardware > verification. With all the compilers available, he decided to hack up a > program to generate random small C programs with computable expected > outputs. His program then compiled the random code with each compiler and > tested the result.
> After finding a number of bugs this way, he got tired of
> submitting the bug reports, and changed his program to write and submit the
> bug reports automatically.
>
> This caused a little bit of consternation with some of the compiler teams at
> first.

I remember that very well. IIRC it was called fuzz testing, and indeed it was controversial, for the reasons Bill McKeeman discusses in his paper. [1] On the one hand, compiler developers said, "nobody would ever write something like that--we can't waste our time on these issues when there are real bugs waiting to be fixed." On the other hand, some of the bugs that fuzz testing turned up provoked reactions such as, "OMG! THAT caused the compiler to crash?" I think the turning point was when fixing one of the fuzz testing bugs also fixed an obscure and hard-to-debug customer problem. Intel's C and Fortran compiler team has also used random testing technology.

> [1] https://www.hpl.hp.com/hpjournal/dtj/vol10num1/vol10num1art9.pdf

-Paul W.

From iain at csp-partnership.co.uk Wed May 6 04:53:59 2020
From: iain at csp-partnership.co.uk (Dr Iain Maoileoin)
Date: Tue, 5 May 2020 19:53:59 +0100
Subject: [TUHS] DEC Compilers (was: Re: SDB debugger
In-Reply-To:
References: <2F4C604D-F01C-4A82-948A-7E77093B48A1@planet.nl> <21F16C75-62AB-422A-A43F-981407E11434@planet.nl> <8D548BBE-AB7A-457E-87F8-F3718A9AC4B7@acm.org>
Message-ID: <3C1A26DC-2FC2-4EEA-A7FF-32FFD83E0007@csp-partnership.co.uk>

> On 5 May 2020, at 18:36, Paul Winalski wrote:
>
> On 5/4/20, Win Treese wrote:
>>
>> ….
>> This caused a little bit of consternation with some of the compiler teams at
>> first.
>
> I remember that very well. IIRC it was called fuzz testing, and
> ….
> compiler team has also used random testing technology.
>
>> [1] https://www.hpl.hp.com/hpjournal/dtj/vol10num1/vol10num1art9.pdf
>
> -Paul W.
Way back in the deep past - late 70s I think - the Computer Science Department at Strathclyde Uni in Scotland had a contract to develop a test suite generator for the C compiler on the ICL PERQ computer. I think the testing/development for that compiler was happening at Dalkeith in Scotland - but don't quote me.

Like the above, we generated programs (e.g. mixing short, int, long, signed and unsigned, and doing all sorts of ops on them). The expected output was computed by the same C program running on a BSD Unix VAX and something else. We had a few issues with the VAX and the other system disagreeing on the arithmetic results, but generally we were confident the random C programs would reasonably test the system under test.

We did not get to see the results of the tests; we developed the suite and handed it over to ICL.

Overall we were not impressed by the PERQ, and on a trip to Rutherford Appleton Labs (RAL) one November there was a HUGE bonfire being prepared (for our Guy Fawkes celebration). The bonfire was generally comprised of the PERQ cardboard packing cases. It just looked like they were planning to burn the PERQs themselves. We agreed with the sentiment.

From henry.r.bent at gmail.com Wed May 6 07:49:11 2020
From: henry.r.bent at gmail.com (Henry Bent)
Date: Tue, 5 May 2020 17:49:11 -0400
Subject: [TUHS] DEC Compilers (was: Re: SDB debugger
In-Reply-To: <8D548BBE-AB7A-457E-87F8-F3718A9AC4B7@acm.org>
References: <2F4C604D-F01C-4A82-948A-7E77093B48A1@planet.nl> <21F16C75-62AB-422A-A43F-981407E11434@planet.nl> <8D548BBE-AB7A-457E-87F8-F3718A9AC4B7@acm.org>
Message-ID:

On Mon, 4 May 2020 at 20:33, Win Treese wrote:

>
> > On May 3, 2020, at 4:26 PM, Clem Cole wrote:
> >
> > Anyway back to compilers, Tru64 had a 'good enough' compiler based on
> > the MIPS code base to get us all going, but GEM's primary target was VMS
> > since one of the important features of GEM was the VAX->Alpha transpiler
> > technology. VMS was still heavily written in VAX Assembler at the time.
> > Plus, it actually was a little hairy because GEM had a new C/C++
> > front-end. So TLE's high order bit was VMS for the Alphas. GEM for
> > Tru64 was about 18 months later.
>
> In the early days of Alpha, I was at DEC’s Cambridge Research Laboratory
> (directed then by Vic Vyssotsky, having retired from Bell Labs). The lab
> had various connections to Alpha projects, and we learned that there were
> (I think) 7 different C compilers running on the early port of Ultrix. That
> number, I think, did not include the port of gcc that DEC was funding
> outside the company.
>
> Andy Payne, a recent hire at the lab, had been an intern in DEC’s
> semiconductor group, where he had worked on randomized testing for hardware
> verification. With all the compilers available, he decided to hack up a
> program to generate random small C programs with computable expected
> outputs. His program then compiled the random code with each compiler and
> tested the result. After finding a number of bugs this way, he got tired of
> submitting the bug reports, and changed his program to write and submit the
> bug reports automatically.
>
> This caused a little bit of consternation with some of the compiler teams
> at first.
>
> Eventually, this led to some collaboration with the DEC languages and
> tools team, and Bill McKeeman published a paper on that line of work in the
> Digital Technical Journal in 1998[1].
>
> - Win
>
> [1] https://www.hpl.hp.com/hpjournal/dtj/vol10num1/vol10num1art9.pdf

Does this software still exist anywhere? The link to the download is long gone, archive.org did not preserve the download, and I had no success finding the files on the web.

-Henry
From crossd at gmail.com Wed May 6 07:59:22 2020
From: crossd at gmail.com (Dan Cross)
Date: Tue, 5 May 2020 17:59:22 -0400
Subject: [TUHS] DEC Compilers (was: Re: SDB debugger
In-Reply-To:
References: <2F4C604D-F01C-4A82-948A-7E77093B48A1@planet.nl> <21F16C75-62AB-422A-A43F-981407E11434@planet.nl> <8D548BBE-AB7A-457E-87F8-F3718A9AC4B7@acm.org>
Message-ID:

On Tue, May 5, 2020 at 1:37 PM Paul Winalski wrote:

> On 5/4/20, Win Treese wrote:
> >
> > Andy Payne, a recent hire at the lab, had been an intern in DEC’s
> > semiconductor group, where he had worked on randomized testing for hardware
> > verification. With all the compilers available, he decided to hack up a
> > program to generate random small C programs with computable expected
> > outputs. His program then compiled the random code with each compiler and
> > tested the result. After finding a number of bugs this way, he got tired of
> > submitting the bug reports, and changed his program to write and submit the
> > bug reports automatically.
> >
> > This caused a little bit of consternation with some of the compiler teams at
> > first.
>
> I remember that very well. IIRC it was called fuzz testing, and
> indeed it was controversial, for the reasons Bill McKeeman discusses
> in his paper. [1] On the one hand, compiler developers said, "nobody
> would ever write something like that--we can't waste our time on these
> issues when there are real bugs waiting to be fixed." On the other
> hand, some of the bugs that fuzz testing turned up provoked reactions
> such as, "OMG! THAT caused the compiler to crash?" I think the
> turning point was when fixing one of the fuzz testing bugs also fixed
> an obscure and hard-to-debug customer problem. Intel's C and Fortran
> compiler team has also used random testing technology.
>

Ah, very cool. The same approach has come into favor again recently.
I've dealt personally with https://github.com/google/syzkaller, which is a kernel fuzzer that generates random inputs to system calls and detects e.g. panics. It's a neat approach.

- Dan C.

> [1] https://www.hpl.hp.com/hpjournal/dtj/vol10num1/vol10num1art9.pdf
>
> -Paul W.
>

From doug at cs.dartmouth.edu Wed May 6 12:52:46 2020
From: doug at cs.dartmouth.edu (Doug McIlroy)
Date: Tue, 05 May 2020 22:52:46 -0400
Subject: [TUHS] DEC Compilers (was: Re: SDB debugger
Message-ID: <202005060252.0462qkWH077588@tahoe.cs.Dartmouth.EDU>

> random small C programs with computable expected outputs

"computable" is subtle here. The only way to compute the outputs was to run the program. McKeeman's trick was to sic several completely unrelated compilers on the program and let them vote on the answer.

Compile time was measured. My favorite "bug" was the many minutes it took to compile a constant expression that involved shifting a constant INT_MAX bits by performing that many 1-bit shifts.

Doug

From doug at cs.dartmouth.edu Thu May 7 01:53:50 2020
From: doug at cs.dartmouth.edu (Doug McIlroy)
Date: Wed, 06 May 2020 11:53:50 -0400
Subject: [TUHS] DEC Compilers (was: Re: SDB debugger
Message-ID: <202005061553.046FroWn099876@tahoe.cs.Dartmouth.EDU>

>> Compile time was measured. My favorite "bug" was the
>> many minutes it took to compile a constant expression
>> that involved shifting a constant INT_MAX bits by
>> performing that many 1-bit shifts.
>
> I don't know if this anecdote is an urban legend or if it really
> happened. I was told [a similar] story when I was interning as an operator
> at my alma mater, which was an IBM System/360 shop.

I heard it not from the grapevine, but from McKeeman himself.
Doug

From tuhs at tkr.bondplaza.com Thu May 7 06:21:08 2020
From: tuhs at tkr.bondplaza.com (Tim Rylance)
Date: Wed, 6 May 2020 21:21:08 +0100
Subject: [TUHS] DEC Compilers (was: Re: SDB debugger
In-Reply-To: <202005061553.046FroWn099876@tahoe.cs.Dartmouth.EDU>
References: <202005061553.046FroWn099876@tahoe.cs.Dartmouth.EDU>
Message-ID: <108B38F3-B0E0-46D3-A78F-0A7CE57C6EEB@tkr.bondplaza.com>

>>> Compile time was measured. My favorite "bug" was the
>>> many minutes it took to compile a constant expression
>>> that involved shifting a constant INT_MAX bits by
>>> performing that many 1-bit shifts.
>>
>> I don't know if this anecdote is an urban legend or if it really
>> happened. I was told [a similar] story when I was interning as an operator
>> at my alma mater, which was an IBM System/360 shop.
>
> I heard it not from the grapevine, but from McKeeman himself.

It’s mentioned in the paper (https://www.hpl.hp.com/hpjournal/dtj/vol10num1/vol10num1art9.pdf) on page 105, table 1:

    Results of Testing C Compilers

    Source Code                     Resulting Problem
    1>>INT_MAX                      Twenty-minute compile time

but not explained. My favourite is

    int(…(x)…)                      Spurious diagnostic (10 parentheses)
    enough nested parentheses       Compiler crash (100 parentheses)
    to kill the compiler            Server crash (10,000 parentheses)

explained on page 104:

    … the server crash occurred when the tested compiler got a stack overflow on a heavily loaded machine with a very large memory. The operating system attempted to dump a gigabyte of compiler stack, which caused all the other active users to thrash, and many of them also dumped for lack of memory. The many disk drives on the server began a dance of the lights that sopped up the remaining free resources, causing the operators to boot the server to recover. Excellent testing can make you unpopular with almost everyone.
From ark-mlist at comcast.net Fri May 8 00:58:08 2020
From: ark-mlist at comcast.net (Andrew Koenig)
Date: Thu, 7 May 2020 10:58:08 -0400
Subject: [TUHS] sml/nj and unix/plan9
In-Reply-To: <3740275.SCiKnK0d8l@slayer.slackware.es>
References: <3740275.SCiKnK0d8l@slayer.slackware.es>
Message-ID: <00aa01d6247f$f1cb08f0$d5611ad0$@comcast.net>

> Was sml/nj part of UNIX at some point? Was it considered as a language to use
> (for proof tools, maybe)?
>
> I was wondering if there is any history in common between the two. I've been
> unable to find anything :-?, please share your stories! :-D
>
> Is it true that the language was too slow to be generally useful? There seem to be
> comments along these lines on the internet.

To my knowledge, sml/nj was never part of the Unix distribution, though it was definitely available thereon (and also on SunOS). One of the main people behind SML/NJ was Dave MacQueen, who was in the same general organization as the Unix people.

As for sml/nj being too slow to be generally useful, Rob Pike (I think) once wrote a desk-calculator program in C. I took that program and rewrote it in sml/nj. Compared to the C version, it ran about twice as slowly and the source code was about half the size. So no, I don't think sml/nj was slow.

From crossd at gmail.com Fri May 8 03:40:26 2020
From: crossd at gmail.com (Dan Cross)
Date: Thu, 7 May 2020 13:40:26 -0400
Subject: [TUHS] sml/nj and unix/plan9
In-Reply-To: <3740275.SCiKnK0d8l@slayer.slackware.es>
References: <3740275.SCiKnK0d8l@slayer.slackware.es>
Message-ID:

On Mon, May 4, 2020 at 8:49 AM wrote:

> Was sml/nj part of UNIX at some point? Was it considered as a language to use
> (for proof tools, maybe)?
>
> I was wondering if there is any history in common between the two. I've been
> unable to find anything :-?, please share your stories! :-D
>

There was certainly proximity, if not a direct connection.

> Is it true that the language was too slow to be generally useful?
> There seem to be comments along these lines on the internet.
>

This question is difficult to answer. As a _language_ there's little that makes SML inherently slow; the MLton compiler does whole-program optimization and generates code that's pretty performant. There are certainly other SML implementations that generate slow code; Moscow ML comes to mind: it generates a byte code that's not known for speed. SML/NJ is pretty zippy, but I've never tried to write anything performance-critical with it.

- Dan C.

From a.phillip.garcia at gmail.com Fri May 8 06:47:30 2020
From: a.phillip.garcia at gmail.com (A. P. Garcia)
Date: Thu, 7 May 2020 16:47:30 -0400
Subject: [TUHS] sml/nj and unix/plan9
In-Reply-To: <3740275.SCiKnK0d8l@slayer.slackware.es>
References: <3740275.SCiKnK0d8l@slayer.slackware.es>
Message-ID:

On Mon, May 4, 2020, 8:48 AM wrote:

> Hello,
>
> Was sml/nj part of UNIX at some point? Was it considered as a language to use
> (for proof tools, maybe)?
>
> I was wondering if there is any history in common between the two. I've been
> unable to find anything :-?, please share your stories! :-D
>
> Is it true that the language was too slow to be generally useful? There seem
> to be comments along these lines on the internet.
>
> thanks!
> gabi
>

If you Google Unix ML, there are two fairly sizable papers on the topic near the top of the results...

From doug at cs.dartmouth.edu Sat May 9 05:58:33 2020
From: doug at cs.dartmouth.edu (Doug McIlroy)
Date: Fri, 08 May 2020 15:58:33 -0400
Subject: [TUHS] uhs@tuhs.org
Message-ID: <202005081958.048JwXt8062283@tahoe.cs.Dartmouth.EDU>

Subject: Re: [TUHS] sml/nj and unix/plan9

sml was in the tenth edition. I used it a bit. I didn't find it unreasonably slow.
Doug

From clemc at ccc.com Mon May 11 07:40:53 2020
From: clemc at ccc.com (Clem Cole)
Date: Sun, 10 May 2020 17:40:53 -0400
Subject: [TUHS] Fwd: Great deal on Unix! / Boing Boing
In-Reply-To: <72BC2DC0-3C89-430C-89F2-29101E9A42AF@acm.org>
References: <72BC2DC0-3C89-430C-89F2-29101E9A42AF@acm.org>
Message-ID:

I felt that I had to pass this along...

https://boingboing.net/2020/05/10/great-deal-on-unix.html

From robpike at gmail.com Mon May 11 10:32:01 2020
From: robpike at gmail.com (Rob Pike)
Date: Mon, 11 May 2020 10:32:01 +1000
Subject: [TUHS] v7 K&R C
In-Reply-To: <3cb1126796176debe28aa66672ba27ae@yaccman.com>
References: <6D6EFA0C-36C3-4225-A331-D1998A07C50A@gmail.com> <3cb1126796176debe28aa66672ba27ae@yaccman.com>
Message-ID:

Interesting that Go had only what you call "typed typedefs" until we needed to add "untyped typedefs" so we could provide aliasing for forwarding declarations. And that necessity made me unhappy.

But the short version: Go went the other way with what "typedef" means.

-rob

On Mon, May 11, 2020 at 10:28 AM wrote:

> Following up on Rob's comment, I always took the point of view that Dennis
> owned the C description, and what he said goes. Not that I didn't make
> suggestions that he accepted. One of the better ones (actually in B) was ^
> for exclusive OR. One of the worse ones was the syntax for casts. We
> looked at about 5 different ideas and hated all of them. And most of them
> couldn't be easily compiled with Yacc. So I took the grammar for
> declarations, removed the variable name, and voila, it expressed everything
> we wanted in the way of semantics, had a simple rule of construction, and
> we badly needed the functionality for the Interdata port. I quickly came
> to hate it, though -- the casts we were using looked like a teletype threw
> up in the middle of the code.
>
> With respect to enums, there is a feature I've wanted for years: a typed
> typedef.
> Saying typedef int foo would make foo an integer, but if you
> passed an ordinary int to something declared as foo it would be an error.
> Even if it was an integer constant unless cast.
>
> The amount of mechanism required to get that behavior from both C and C++
> is horrible, so far as I know, although C++ has accreted so much stuff
> maybe it's there now...
>
> Steve
> ---
>
> On 2020-04-24 19:54, Rob Pike wrote:
>
> > Another debate at the time was caused by a disagreement between pcc and cc
> > regarding enums: are they a type or just a way to declare constants? I
> > remember getting annoyed by pcc not letting me declare a constant with an
> > enum and use it as an int. I protested to scj and dmr and after some to-ing
> > and fro-ing Steve changed pcc to treat them as constants.
> >
> > Not sure it was the right decision, but C desperately wanted a non-macro
> > way to define a constant. I'd probably argue the same way today. The real
> > lesson is how propinquity affects progress.
> >
> > -rob
> >
> > On Sat, Apr 25, 2020 at 12:51 PM Rob Pike wrote:
> >
> > The ability to call a function pointer fp with the syntax fp() rather than
> > (*fp)() came rather late, I think at Bjarne's suggestion or example. Pretty
> > sure it was not in v7 C, as you observe.
> >
> > Convenient though the shorthand may be, it always bothered me as
> > inconsistent and misleading. (I am pretty sure I used it sometimes
> > regardless.)
> >
> > -rob
> >
> > On Sat, Apr 25, 2020 at 12:48 PM Adam Thornton wrote:
> >
> > On Apr 24, 2020, at 7:37 PM, Charles Anthony wrote:
> >
> > On Fri, Apr 24, 2020 at 7:00 PM Adam Thornton wrote:
> >
> > This doesn't like the function pointer.
> >
> > $ cc -c choparg.c
> > choparg.c:11: Call of non-function
> >
> > Perhaps:
> >
> > (*fcn)(arg);
> >
> > We have a winner!
> >
> > Also, Kartik, dunno where it is on the net, but if you install a v7
> > system, /usr/src/cmd/c
> >
> > Adam
From scj at yaccman.com Mon May 11 10:28:04 2020
From: scj at yaccman.com (scj at yaccman.com)
Date: Sun, 10 May 2020 17:28:04 -0700
Subject: [TUHS] v7 K&R C
In-Reply-To:
References: <6D6EFA0C-36C3-4225-A331-D1998A07C50A@gmail.com>
Message-ID: <3cb1126796176debe28aa66672ba27ae@yaccman.com>

Following up on Rob's comment, I always took the point of view that Dennis owned the C description, and what he said goes. Not that I didn't make suggestions that he accepted. One of the better ones (actually in B) was ^ for exclusive OR. One of the worse ones was the syntax for casts. We looked at about 5 different ideas and hated all of them. And most of them couldn't be easily compiled with Yacc. So I took the grammar for declarations, removed the variable name, and voila, it expressed everything we wanted in the way of semantics, had a simple rule of construction, and we badly needed the functionality for the Interdata port. I quickly came to hate it, though -- the casts we were using looked like a teletype threw up in the middle of the code.

With respect to enums, there is a feature I've wanted for years: a typed typedef. Saying typedef int foo would make foo an integer, but if you passed an ordinary int to something declared as foo it would be an error. Even if it was an integer constant unless cast.

The amount of mechanism required to get that behavior from both C and C++ is horrible, so far as I know, although C++ has accreted so much stuff maybe it's there now...

Steve
---

On 2020-04-24 19:54, Rob Pike wrote:

> Another debate at the time was caused by a disagreement between pcc and cc regarding enums: are they a type or just a way to declare constants? I remember getting annoyed by pcc not letting me declare a constant with an enum and use it as an int. I protested to scj and dmr and after some to-ing and fro-ing Steve changed pcc to treat them as constants.
>
> Not sure it was the right decision, but C desperately wanted a non-macro way to define a constant.
> I'd probably argue the same way today. The real lesson is how propinquity affects progress.
>
> -rob
>
> On Sat, Apr 25, 2020 at 12:51 PM Rob Pike wrote:
>
> The ability to call a function pointer fp with the syntax fp() rather than (*fp)() came rather late, I think at Bjarne's suggestion or example. Pretty sure it was not in v7 C, as you observe.
>
> Convenient though the shorthand may be, it always bothered me as inconsistent and misleading. (I am pretty sure I used it sometimes regardless.)
>
> -rob
>
> On Sat, Apr 25, 2020 at 12:48 PM Adam Thornton wrote:
>
> On Apr 24, 2020, at 7:37 PM, Charles Anthony wrote:
>
> On Fri, Apr 24, 2020 at 7:00 PM Adam Thornton wrote:
>
> This doesn't like the function pointer.
>
> $ cc -c choparg.c
> choparg.c:11: Call of non-function
>
> Perhaps:
>
> (*fcn)(arg);
>
> We have a winner!
>
> Also, Kartik, dunno where it is on the net, but if you install a v7 system, /usr/src/cmd/c
>
> Adam

From scj at yaccman.com Mon May 11 10:36:56 2020
From: scj at yaccman.com (scj at yaccman.com)
Date: Sun, 10 May 2020 17:36:56 -0700
Subject: [TUHS] Bell Labs recruiter
In-Reply-To:
References:
Message-ID:

I don't know the recruiter, but I remember Ken's wife mock-complaining "I went clear across the country to get away from home, married Ken, and he brought me back here two miles from my mother..."

Ask Rob Pike about his recruitment (I was the technical contact, but it's his story to tell...)

Steve
---

On 2020-04-14 23:46, Efton Collins wrote:

> I was lucky enough to be in the room last year at VCF East when Ken
> told the story of how the move from Berkeley to Bell Labs happened.
> Ken's description of his interactions with the Bell recruiter was
> entertaining and made clear that persistent effort was needed to get
> him to come out to New Jersey and meet some of the people there.
>
> Does anyone know who the recruiter was?
From lm at mcvoy.com Mon May 11 10:57:46 2020
From: lm at mcvoy.com (Larry McVoy)
Date: Sun, 10 May 2020 17:57:46 -0700
Subject: [TUHS] v7 K&R C
In-Reply-To:
References: <6D6EFA0C-36C3-4225-A331-D1998A07C50A@gmail.com> <3cb1126796176debe28aa66672ba27ae@yaccman.com>
Message-ID: <20200511005745.GL17035@mcvoy.com>

My mail is screwed up, I see Rob's reply to Steve but didn't see Steve's original.

> On Mon, May 11, 2020 at 10:28 AM wrote:
> > With respect to enums, there is a feature I've wanted for years: a typed
> > typedef. Saying typedef int foo would make foo an integer, but if you
> > passed an ordinary int to something declared as foo it would be an error.
> > Even if it was an integer constant unless cast.

Steve, I couldn't agree more, you are 100% right, this is how it should work. I wanted to like enums because I naively thought they'd have these semantics but then learned they really aren't any different than a well managed list of #defines.

IMHO, without your semantics, enums are pretty useless, #define is good enough and more clear.

--lm

From scj at yaccman.com Mon May 11 11:00:27 2020
From: scj at yaccman.com (scj at yaccman.com)
Date: Sun, 10 May 2020 18:00:27 -0700
Subject: [TUHS] Question: stdio - Who invented and ...
In-Reply-To:
References: <202003231032.02NAWY4v022713@freefriends.org>
Message-ID: <94b40dae0dba9d37b2de778b2d32e80a@yaccman.com>

I can't help with enum. It sticks in my mind that some other language had something similar but details are gone.

However, void was more interesting. The original default return type for both B and C was int. As part of doing lint, I started wondering whether it would be good to have a message "function returns a value that is unused". In some cases, this would be a really useful message. That feature never made it into lint, but the idea persisted, and Dennis and I discussed void as a keyword to say "I'm not returning anything".
It became useful as a message if you try to return a value from a function declared as void.

But the real brilliance, to my mind, was the invention of void *. Larry Rosler and I were at a Usenix meeting together, and after a bit of alcoholic lubrication, I started complaining about trying to use malloc and also have strong typing. Larry suddenly said -- "we need a pointer that can't be indirected through, but can be assigned to any other pointer!" and after a minute or so he said "void *. We should call it void * ." My memory is that Dennis was instantly enthusiastic and it was in PCC a day or two later.

Steve
---

On 2020-03-23 19:02, Greg A. Woods wrote:

> At Mon, 23 Mar 2020 09:46:52 -0400, Clem Cole wrote:
> Subject: Re: [TUHS] Question: stdio - Who invented and ...
>>
>> I've forgotten when 'enum' and 'void' got added (which are not in the white
>> book - Steve Johnson or Doug may remember). But, I think they were in the
>> V7 compiler, and not Typesetter C.
>
> Since I was recently researching these myself:
>
> There was an extra page in the 7th Edition manual titled "Recent Changes
> to C" which described both structure assignment and the enumeration type:
>
> https://www.bell-labs.com/usr/dmr/www/cchanges.pdf
>
> This paper appears in the UNIX System III "The C Programming Language
> Reference Manual", but there's no mention of "void" in that manual. On
> the other hand the UNIX System III PDP-11 compiler mentions "void"
> (1980).
>
> I don't see any mention of "void" in 7th Edition sources. However the
> version of 'awk' on the v7addenda tape from "12/2/80" has one "(void)"
> cast. The only mention of "void" in dmr's "The Development of the C
> Language" paper (from HOPL-II, 1993) seems to be in the
> "Standardization" section where it's mentioned that it's not described
> in the first edition of K&R. There's mention in the CSTR#102 paper from
> Sept. 1981 of the "void" type.
> The 2.9BSD code uses "void", but the
> sources I have don't include a copy of the compiler.
>
> --
> Greg A. Woods
>
> Kelowna, BC +1 250 762-7675 RoboHack
> Planix, Inc. Avoncote Farms

From stewart at serissa.com Mon May 11 12:08:58 2020
From: stewart at serissa.com (Lawrence Stewart)
Date: Sun, 10 May 2020 22:08:58 -0400
Subject: [TUHS] v7 K&R C
In-Reply-To: <3cb1126796176debe28aa66672ba27ae@yaccman.com>
References: <6D6EFA0C-36C3-4225-A331-D1998A07C50A@gmail.com> <3cb1126796176debe28aa66672ba27ae@yaccman.com>
Message-ID:

If I remember correctly, Mesa (the PARC Pascal-like systems language) had typed enums. The 1979 version of the language manual at http://www.bitsavers.org/pdf/xerox/parc/techReports/CSL-79-3_Mesa_Language_Manual_Version_5.0.pdf says so anyway.

-L

PS The niftiest use of #define I know about was at the short-lived supercomputer company SiCortex around 2005. Wilson Snyder (of Verilator fame) wrote a thing that extracted all the constants and register definitions from the CPU chip spec and output them as #define equivalents in 5 different languages.

PPS Thank you for ‘^’

> On 2020, May 10, at 8:28 PM, scj at yaccman.com wrote:
>
> Following up on Rob's comment, I always took the point of view that Dennis owned the C description, and what he said goes. Not that I didn't make suggestions that he accepted. One of the better ones (actually in B) was ^ for exclusive OR. One of the worse ones was the syntax for casts. We looked at about 5 different ideas and hated all of them. And most of them couldn't be easily compiled with Yacc. So I took the grammar for declarations, removed the variable name, and voila, it expressed everything we wanted in the way of semantics, had a simple rule of construction, and we badly needed the functionality for the Interdata port. I quickly came to hate it, though -- the casts we were using looked like a teletype threw up in the middle of the code.
>
> With respect to enums, there is a feature I've wanted for years: a typed typedef. Saying typedef int foo would make foo an integer, but if you passed an ordinary int to something declared as foo it would be an error. Even if it was an integer constant unless cast.
>
> The amount of mechanism required to get that behavior from both C and C++ is horrible, so far as I know, although C++ has accreted so much stuff maybe it's there now...
>
> Steve
>
> ---
>
> On 2020-04-24 19:54, Rob Pike wrote:
>
>> Another debate at the time was caused by a disagreement between pcc and cc regarding enums: are they a type or just a way to declare constants? I remember getting annoyed by pcc not letting me declare a constant with an enum and use it as an int. I protested to scj and dmr and after some to-ing and fro-ing Steve changed pcc to treat them as constants.
>>
>> Not sure it was the right decision, but C desperately wanted a non-macro way to define a constant. I'd probably argue the same way today. The real lesson is how propinquity affects progress.
>>
>> -rob
>>
>> On Sat, Apr 25, 2020 at 12:51 PM Rob Pike wrote:
>> The ability to call a function pointer fp with the syntax fp() rather than (*fp)() came rather late, I think at Bjarne's suggestion or example. Pretty sure it was not in v7 C, as you observe.
>>
>> Convenient though the shorthand may be, it always bothered me as inconsistent and misleading. (I am pretty sure I used it sometimes regardless.)
>>
>> -rob
>>
>> On Sat, Apr 25, 2020 at 12:48 PM Adam Thornton wrote:
>>
>>> On Apr 24, 2020, at 7:37 PM, Charles Anthony wrote:
>>>
>>> On Fri, Apr 24, 2020 at 7:00 PM Adam Thornton wrote:
>>> This doesn't like the function pointer.
>>>
>>> $ cc -c choparg.c
>>> choparg.c:11: Call of non-function
>>>
>>> Perhaps:
>>>
>>> (*fcn)(arg);
>>>
>>
>> We have a winner!
>>
>> Also, Kartik, dunno where it is on the net, but if you install a v7 system, /usr/src/cmd/c
>>
>> Adam

From michael at kjorling.se Mon May 11 21:36:39 2020
From: michael at kjorling.se (Michael Kjörling)
Date: Mon, 11 May 2020 11:36:39 +0000
Subject: [TUHS] v7 K&R C
In-Reply-To: <3cb1126796176debe28aa66672ba27ae@yaccman.com>
References: <6D6EFA0C-36C3-4225-A331-D1998A07C50A@gmail.com> <3cb1126796176debe28aa66672ba27ae@yaccman.com>
Message-ID:

On 10 May 2020 17:28 -0700, from scj at yaccman.com:
> With respect to enums, there is a feature I've wanted for years: a typed
> typedef. Saying typedef int foo would make foo an integer, but if you
> passed an ordinary int to something declared as foo it would be an
> error. Even if it was an integer constant unless cast.

Isn't that at least pretty close to how Ada does it?

--
Michael Kjörling • https://michael.kjorling.se • michael at kjorling.se
“Remember when, on the Internet, nobody cared that you were a dog?”

From woods at robohack.ca Tue May 12 03:32:14 2020
From: woods at robohack.ca (Greg A. Woods)
Date: Mon, 11 May 2020 10:32:14 -0700
Subject: [TUHS] v7 K&R C
In-Reply-To: <20200511005745.GL17035@mcvoy.com>
References: <6D6EFA0C-36C3-4225-A331-D1998A07C50A@gmail.com> <3cb1126796176debe28aa66672ba27ae@yaccman.com> <20200511005745.GL17035@mcvoy.com>
Message-ID:

At Sun, 10 May 2020 17:57:46 -0700, Larry McVoy wrote:
Subject: Re: [TUHS] v7 K&R C
>
> > On Mon, May 11, 2020 at 10:28 AM wrote:
> > > With respect to enums, there is a feature I've wanted for years: a typed
> > > typedef. Saying typedef int foo would make foo an integer, but if you
> > > passed an ordinary int to something declared as foo it would be an error.
> > > Even if it was an integer constant unless cast.
>
> Steve, I couldn't agree more, you are 100% right, this is how it should
> work.
I wanted to like enums because I naively thought they'd have these > semantics but then learned they really aren't any different than a well > managed list of #defines. Absolutely agreed! The lameness of typedef (and in how enum is related to typedef) is one of the saddest parts of C. (The other is the default promotion to int.) It would be trivial to fix too -- for a "new" C, that is. Making it backward compatible for legacy code would be tough, even with tooling to help fix the worst issues. I've seen far too much code that would be hard to fix by hand, e.g. some that even goes so far as to assume things like arithmetic on enum values will produce other valid enum values. Ideally enums could be a value in any native type, including float/double. > IMHO, without your semantics, enums are pretty useless, #define is good > enough and more clear. Actually that's no longer true with a good modern toolchain, especially with respect to the debugger. A good debugger can now show the enum symbol for a (matching) value of a properly typedefed variable. (In fact I never thought that a #define macro was more clear, even before debugger support -- the debugger support just gave me a better excuse to use to explain my preference!) -- Greg A. Woods Kelowna, BC +1 250 762-7675 RoboHack Planix, Inc. Avoncote Farms -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 195 bytes Desc: OpenPGP Digital Signature URL: From paul.winalski at gmail.com Tue May 12 04:25:15 2020 From: paul.winalski at gmail.com (Paul Winalski) Date: Mon, 11 May 2020 14:25:15 -0400 Subject: [TUHS] v7 K&R C In-Reply-To: References: <6D6EFA0C-36C3-4225-A331-D1998A07C50A@gmail.com> <3cb1126796176debe28aa66672ba27ae@yaccman.com> <20200511005745.GL17035@mcvoy.com> Message-ID: On 5/11/20, Greg A. Woods wrote: > > The lameness of typedef (and in how enum is related to typedef) is one > of the saddest parts of C.
(The other is the default promotion to int.) I would add a third: file-scope declarations being global by default. One must use the keyword "static" to restrict a file-scope declaration to the file it's declared in. And why "static"? All file-scope declarations have static allocation. Why isn't the keyword "local" or "own"? Anyway, the way it ought to be is that file-scope declarations are restricted to the file they're declared in. To make the symbol visible outside its file, you should have to explicitly say "global". > It would be trivial to fix too -- for a "new" C, that is. Making it > backward compatible for legacy code would be tough, even with tooling to > help fix the worst issues. I've seen far too much code that would be > hard to fix by hand, e.g. some that even goes so far as to assume things > like arithmetic on enum values will produce other valid enum values. This ought to be easy to fix using a compiler command line option for the legacy behavior. Many C compilers do this already to support K&R semantics vs. standard C semantics. > Ideally enums could be a value in any native type, including float/double. Except pointers, of course. >> IMHO, without your semantics, enums are pretty useless, #define is good >> enough and more clear. > > Actually that's no longer true with a good modern toolchain, especially > with respect to the debugger. A good debugger can now show the enum > symbol for a (matching) value of a properly typedefed variable. Indeed. -Paul W. From lm at mcvoy.com Tue May 12 04:37:20 2020 From: lm at mcvoy.com (Larry McVoy) Date: Mon, 11 May 2020 11:37:20 -0700 Subject: [TUHS] v7 K&R C In-Reply-To: References: <6D6EFA0C-36C3-4225-A331-D1998A07C50A@gmail.com> <3cb1126796176debe28aa66672ba27ae@yaccman.com> <20200511005745.GL17035@mcvoy.com> Message-ID: <20200511183720.GR17035@mcvoy.com> On Mon, May 11, 2020 at 02:25:15PM -0400, Paul Winalski wrote: > On 5/11/20, Greg A. 
Woods wrote: > > > > The lameness of typedef (and in how enum is related to typedef) is one > > of the saddest parts of C. (The other is the default promotion to int.) > > I would add a third: file-scope declarations being global by default. > One must use the keyword "static" to restrict a file-scope declaration > to the file it's declared in. And why "static"? All file-scope I never cared for "static" either, seemed weird. All my code is #define private static private int super_duper(void) { ... } and everyone knows what that means at a glance. > declarations have static allocation. Why isn't the keyword "local" or > "own"? Anyway, the way it ought to be is that file-scope declarations > are restricted to the file they're declared in. To make the symbol > visible outside its file, you should have to explicitly say "global". > > > It would be trivial to fix too -- for a "new" C, that is. Making it > > backward compatible for legacy code would be tough, even with tooling to > > help fix the worst issues. I've seen far too much code that would be > > hard to fix by hand, e.g. some that even goes so far as to assume things > > like arithmetic on enum values will produce other valid enum values. > > This ought to be easy to fix using a compiler command line option for > the legacy behavior. Many C compilers do this already to support K&R > semantics vs. standard C semantics. > > > Ideally enums could be a value in any native type, including float/double. > > Except pointers, of course. > > >> IMHO, without your semantics, enums are pretty useless, #define is good > >> enough and more clear. > > > > Actually that's no longer true with a good modern toolchain, especially > > with respect to the debugger. A good debugger can now show the enum > > symbol for a (matching) value of a properly typedefed variable. > > Indeed. > > -Paul W. 
-- --- Larry McVoy lm at mcvoy.com http://www.mcvoy.com/lm From clemc at ccc.com Tue May 12 04:37:12 2020 From: clemc at ccc.com (Clem Cole) Date: Mon, 11 May 2020 14:37:12 -0400 Subject: [TUHS] v7 K&R C In-Reply-To: References: <6D6EFA0C-36C3-4225-A331-D1998A07C50A@gmail.com> <3cb1126796176debe28aa66672ba27ae@yaccman.com> <20200511005745.GL17035@mcvoy.com> Message-ID: On Mon, May 11, 2020 at 2:25 PM Paul Winalski wrote: > This ought to be easy to fix using a compiler command line option for > the legacy behavior. Many C compilers do this already to support K&R > semantics vs. standard C semantics. > Hrrrumph Point taken but ... C++ is an example in my mind of not listening to Dennis' words: - “C is quirky, flawed, and an enormous success.” - “When I read commentary about suggestions for where C should go, I often think back and give thanks that it wasn't developed under the advice of a worldwide crowd.” - “A language that doesn't have everything is actually easier to program in than some that do” -------------- next part -------------- An HTML attachment was scrubbed... URL: From paul.winalski at gmail.com Tue May 12 05:12:32 2020 From: paul.winalski at gmail.com (Paul Winalski) Date: Mon, 11 May 2020 15:12:32 -0400 Subject: [TUHS] v7 K&R C In-Reply-To: References: <6D6EFA0C-36C3-4225-A331-D1998A07C50A@gmail.com> <3cb1126796176debe28aa66672ba27ae@yaccman.com> <20200511005745.GL17035@mcvoy.com> Message-ID: On 5/11/20, Clem Cole wrote: > > C++ is an example in my mind of not listening to Dennis' words: > > - “C is quirky, flawed, and an enormous success.” Ditto Fortran. > - “When I read commentary about suggestions for where C should go, I > often think back and give thanks that it wasn't developed under the > advice > of a worldwide crowd.” The old saying of an elephant being a mouse designed by committee comes to mind. Language standards committees tend to be like a pack of dogs contemplating a tree. Each dog isn't satisfied with the tree until he's peed on it. 
> - “A language that doesn't have everything is actually easier to program > in than some that do” Big, comprehensive languages such as PL/I, Ada, and C++ tend to have more of their share of toxic language features--things that shouldn't be used if you want reliable, easily maintained and understood code. Ada failed for two reasons: [1] it had cooties because of its military origins, and [2] it collapsed under the weight of all of its features. -Paul W. From joe at via.net Tue May 12 05:57:01 2020 From: joe at via.net (joe mcguckin) Date: Mon, 11 May 2020 12:57:01 -0700 Subject: [TUHS] v7 K&R C In-Reply-To: References: <6D6EFA0C-36C3-4225-A331-D1998A07C50A@gmail.com> <3cb1126796176debe28aa66672ba27ae@yaccman.com> <20200511005745.GL17035@mcvoy.com> Message-ID: <357EFE54-BD94-4C10-8C43-C6735BF7D317@via.net> Maybe it’s time for C++ subset ‘G' Joe McGuckin ViaNet Communications joe at via.net 650-207-0372 cell 650-213-1302 office 650-969-2124 fax > On May 11, 2020, at 12:12 PM, Paul Winalski wrote: > > On 5/11/20, Clem Cole wrote: >> >> C++ is an example in my mind of not listening to Dennis' words: >> >> - “C is quirky, flawed, and an enormous success.” > > Ditto Fortran. > >> - “When I read commentary about suggestions for where C should go, I >> often think back and give thanks that it wasn't developed under the >> advice >> of a worldwide crowd.” > > The old saying of an elephant being a mouse designed by committee comes to mind. > > Language standards committees tend to be like a pack of dogs > contemplating a tree. Each dog isn't satisfied with the tree until > he's peed on it. > >> - “A language that doesn't have everything is actually easier to program >> in than some that do” > > Big, comprehensive languages such as PL/I, Ada, and C++ tend to have > more of their share of toxic language features--things that shouldn't > be used if you want reliable, easily maintained and understood code. 
> Ada failed for two reasons: [1] it had cooties because of its > military origins, and [2] it collapsed under the weight of all of its > features. > > -Paul W. From lm at mcvoy.com Tue May 12 06:25:55 2020 From: lm at mcvoy.com (Larry McVoy) Date: Mon, 11 May 2020 13:25:55 -0700 Subject: [TUHS] v7 K&R C In-Reply-To: <357EFE54-BD94-4C10-8C43-C6735BF7D317@via.net> References: <3cb1126796176debe28aa66672ba27ae@yaccman.com> <20200511005745.GL17035@mcvoy.com> <357EFE54-BD94-4C10-8C43-C6735BF7D317@via.net> Message-ID: <20200511202555.GU17035@mcvoy.com> Isn't that effectively what companies do now? Don't they all have a "Here is what you can use, this and nothing else" doc? On Mon, May 11, 2020 at 12:57:01PM -0700, joe mcguckin wrote: > Maybe it’s time for C++ subset ‘G' > > > Joe McGuckin > ViaNet Communications > > joe at via.net > 650-207-0372 cell > 650-213-1302 office > 650-969-2124 fax > > > > > On May 11, 2020, at 12:12 PM, Paul Winalski wrote: > > > > On 5/11/20, Clem Cole wrote: > >> > >> C++ is an example in my mind of not listening to Dennis' words: > >> > >> - “C is quirky, flawed, and an enormous success.” > > > > Ditto Fortran. > > > >> - “When I read commentary about suggestions for where C should go, I > >> often think back and give thanks that it wasn't developed under the > >> advice > >> of a worldwide crowd.” > > > > The old saying of an elephant being a mouse designed by committee comes to mind. > > > > Language standards committees tend to be like a pack of dogs > > contemplating a tree. Each dog isn't satisfied with the tree until > > he's peed on it. > > > >> - “A language that doesn't have everything is actually easier to program > >> in than some that do” > > > > Big, comprehensive languages such as PL/I, Ada, and C++ tend to have > > more of their share of toxic language features--things that shouldn't > > be used if you want reliable, easily maintained and understood code.
> > Ada failed for two reasons: [1] it had cooties because of its > > military origins, and [2] it collapsed under the weight of all of its > > features. > > > > -Paul W. -- --- Larry McVoy lm at mcvoy.com http://www.mcvoy.com/lm From dave at horsfall.org Tue May 12 14:15:30 2020 From: dave at horsfall.org (Dave Horsfall) Date: Tue, 12 May 2020 14:15:30 +1000 (EST) Subject: [TUHS] SDB debugger In-Reply-To: <2F4C604D-F01C-4A82-948A-7E77093B48A1@planet.nl> References: <2F4C604D-F01C-4A82-948A-7E77093B48A1@planet.nl> Message-ID: On Fri, 1 May 2020, Paul Ruizendaal wrote: > - On 6th edition the debugger was ‘cdb’ I thought it was straight "db"... -- Dave From dave at horsfall.org Tue May 12 14:36:37 2020 From: dave at horsfall.org (Dave Horsfall) Date: Tue, 12 May 2020 14:36:37 +1000 (EST) Subject: [TUHS] SDB debugger In-Reply-To: <7A8D26C8-67DB-4676-92D1-7842209F1807@cfcl.com> References: <202005020252.0422qnFL066007@tahoe.cs.Dartmouth.EDU> <20200502174518.GC30768@mcvoy.com> <7A8D26C8-67DB-4676-92D1-7842209F1807@cfcl.com> Message-ID: On Sun, 3 May 2020, Rich Morin wrote: >> On May 2, 2020, at 10:45, Larry McVoy wrote: >> >> ... But truth be known, I'm sort of a printf() debugger. ... [...] As am I, but the bug sometimes disappears :-( Code offsets, or something? -- Dave From paul.winalski at gmail.com Wed May 13 03:23:27 2020 From: paul.winalski at gmail.com (Paul Winalski) Date: Tue, 12 May 2020 13:23:27 -0400 Subject: [TUHS] v7 K&R C In-Reply-To: <20200511202555.GU17035@mcvoy.com> References: <3cb1126796176debe28aa66672ba27ae@yaccman.com> <20200511005745.GL17035@mcvoy.com> <357EFE54-BD94-4C10-8C43-C6735BF7D317@via.net> <20200511202555.GU17035@mcvoy.com> Message-ID: On 5/11/20, Larry McVoy wrote: > Isn't that effectively what companies do now? Don't they all have a > "Here is what you can use, this and nothing else" doc? > > On Mon, May 11, 2020 at 12:57:01PM -0700, joe mcguckin wrote: >> Maybe it’s time for C++ subset ‘G' Absolutely.
The projects that I ran effectively used C++ as a stronger-typed version of C. A small subset of C++ features were allowed, but among the prohibited features were: o multiple inheritance o operator overloading o friend classes o C++ exception handling o all std:: and STL functions The last two of these are mainly for performance reasons. throw and catch play merry hell with compiler optimizations, especially of global variables. -Paul W. From ron at ronnatalie.com Wed May 13 03:35:24 2020 From: ron at ronnatalie.com (ron at ronnatalie.com) Date: Tue, 12 May 2020 13:35:24 -0400 Subject: [TUHS] v7 K&R C In-Reply-To: References: <3cb1126796176debe28aa66672ba27ae@yaccman.com> <20200511005745.GL17035@mcvoy.com> <357EFE54-BD94-4C10-8C43-C6735BF7D317@via.net> <20200511202555.GU17035@mcvoy.com> Message-ID: > On 5/11/20, Larry McVoy wrote: > o all std:: and STL functions > > The last two of these are mainly for performance reasons. throw and > catch play merry hell with compiler optimizations, especially of > global variables. You'll have to explain to me how templates or the standard library (which by the way includes all of the C stuff) affects performance. In fact, we use templates to INCREASE rather than decrease performance. Templating is almost entirely compile time rewrites. From lm at mcvoy.com Wed May 13 03:42:57 2020 From: lm at mcvoy.com (Larry McVoy) Date: Tue, 12 May 2020 10:42:57 -0700 Subject: [TUHS] v7 K&R C In-Reply-To: References: <20200511005745.GL17035@mcvoy.com> <357EFE54-BD94-4C10-8C43-C6735BF7D317@via.net> <20200511202555.GU17035@mcvoy.com> Message-ID: <20200512174257.GB9381@mcvoy.com> Just a note, you seemed like you are replying to me (see below) but what you quoted Paul wrote. I am most certainly NOT putting myself out there as a C++ expert, I'm a C guy through and through. 
On Tue, May 12, 2020 at 01:35:24PM -0400, ron at ronnatalie.com wrote: > > On 5/11/20, Larry McVoy wrote: > > > o all std:: and STL functions > > > > The last two of these are mainly for performance reasons. throw and > > catch play merry hell with compiler optimizations, especially of > > global variables. > > You'll have to explain to me how templates or the standard library (which > by the way includes all of the C stuff) affects performance. In fact, we > use templates to INCREASE rather than decrease performance. Templating > is almost entirely compile time rewrites. -- --- Larry McVoy lm at mcvoy.com http://www.mcvoy.com/lm From paul.winalski at gmail.com Wed May 13 04:36:38 2020 From: paul.winalski at gmail.com (Paul Winalski) Date: Tue, 12 May 2020 14:36:38 -0400 Subject: [TUHS] v7 K&R C In-Reply-To: References: <3cb1126796176debe28aa66672ba27ae@yaccman.com> <20200511005745.GL17035@mcvoy.com> <357EFE54-BD94-4C10-8C43-C6735BF7D317@via.net> <20200511202555.GU17035@mcvoy.com> Message-ID: On 5/12/20, ron at ronnatalie.com wrote: >> On 5/11/20, Larry McVoy wrote: > >> o all std:: and STL functions >> >> The last two of these are mainly for performance reasons. throw and >> catch play merry hell with compiler optimizations, especially of >> global variables. > > You'll have to explain to me how templates or the standard library (which > by the way includes all of the C stuff) affects performance. In fact, we > use templates to INCREASE rather than decrease performance. Templating > is almost entirely compile time rewrites. The C++ standard libraries make heavy use of throw/catch exception handling. If routine A calls routine B, and B is known by the compiler to have the capability to throw exceptions, a bunch of important optimizations can't be done. For example: o You can't keep global values in registers around the call to B because the handler that catches an exception that B throws might use that global variable. 
So you have to spill the value around the call. o You can't do value propagation of global variables around the call to B because a handler might change their values. And it gets a lot worse when you start doing parallel loop execution. I implemented a new design for exception handling in a C/C++ compiler back end, and I found lots of corner cases where the C++ standard was silent as to what should happen when exceptions are thrown or caught from parallel threads. Things such as the order of execution of constructors and destructors for parallel routines when a thrown exception is unwound, and which side of the parallelization executes constructors and destructors under those conditions. The committee just plain never considered those issues. -Paul W. From tangentdelta at protonmail.com Wed May 13 05:36:43 2020 From: tangentdelta at protonmail.com (TangentDelta) Date: Tue, 12 May 2020 19:36:43 +0000 Subject: [TUHS] Linotron 202 Information Message-ID: Hello. I have a pair of controller card cages out of Mergenthaler Linotron 202 photo-typesetting machines. Sadly the machines themselves were scrapped, and these card cages are all I was able to save. The controllers use Computer Automation Naked Mini processors, which are relatively small 16-bit minicomputers designed for embedded control applications. I've been hacking on these for a few months now and have built up a system bus pinout diagram and several schematics. I haven't been able to find any technical information online in regards to the specific model of Naked Mini processor used in the 202, but I have found a trove of documents for other Naked Mini models on Bitsavers. 
I pulled the 512x4bit "boot ROM" mentioned in the "Experience with the Mergenthaler Linotron 202 Phototypesetter, or, How We Spent Our Summer Vacation" paper and dumped it, but the resulting binary doesn't produce any sane-looking code when manually disassembled using the documents on Bitsavers and reference, no matter how I arrange the nybbles. The processor also does not appear to respect the control opcodes issued by the Computer Automation LSI series programming console that I obtained. This has led me to the hypothesis that this is not a stock "Naked Mini" or later "Naked Milli" processor, but something specific to Mergenthaler. My goal is to get the processor to run my own code, and eventually design my own MaxiBus peripherals to use with it. If anyone knows where I can look for more information in regards to the 202 and the Naked Mini processor, or has any stories of working on these machines, I would greatly appreciate it! Thanks. -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave at horsfall.org Thu May 14 09:36:57 2020 From: dave at horsfall.org (Dave Horsfall) Date: Thu, 14 May 2020 09:36:57 +1000 (EST) Subject: [TUHS] v7 K&R C In-Reply-To: References: <3cb1126796176debe28aa66672ba27ae@yaccman.com> <20200511005745.GL17035@mcvoy.com> <357EFE54-BD94-4C10-8C43-C6735BF7D317@via.net> <20200511202555.GU17035@mcvoy.com> Message-ID: On Tue, 12 May 2020, Paul Winalski wrote: > Absolutely. The projects that I ran effectively used C++ as a > stronger-typed version of C. A small subset of C++ features were > allowed, but among the prohibited features were: [...] > o operator overloading [...] I never could figure out why Stroustrup implemented that "feature"; let's see, this operator usually means this, except when you use it in that situation in which case it means something else. Now, try debugging that. I had to learn C++ for a project at $WORK years ago (the client demanded it), and boy was I glad when I left... 
-- Dave From jpl.jpl at gmail.com Thu May 14 10:42:55 2020 From: jpl.jpl at gmail.com (John P. Linderman) Date: Wed, 13 May 2020 20:42:55 -0400 Subject: [TUHS] v7 K&R C In-Reply-To: References: <3cb1126796176debe28aa66672ba27ae@yaccman.com> <20200511005745.GL17035@mcvoy.com> <357EFE54-BD94-4C10-8C43-C6735BF7D317@via.net> <20200511202555.GU17035@mcvoy.com> Message-ID: I never liked call by reference. When I was trying to understand a chunk of code, it was a great mental simplification to know that whatever a called routine did, it couldn't have an effect on the code I was trying to understand except through a returned value and (ghastly) global variables. Operator overloading is far worse. Now I can't even be sure code I'm looking at is doing what I thought it did. On Wed, May 13, 2020 at 7:38 PM Dave Horsfall wrote: > On Tue, 12 May 2020, Paul Winalski wrote: > > > Absolutely. The projects that I ran effectively used C++ as a > > stronger-typed version of C. A small subset of C++ features were > > allowed, but among the prohibited features were: > > [...] > > > o operator overloading > > [...] > > I never could figure out why Stroustrup implemented that "feature"; let's > see, this operator usually means this, except when you use it in that > situation in which case it means something else. Now, try debugging that. > > I had to learn C++ for a project at $WORK years ago (the client demanded > it), and boy was I glad when I left... > > -- Dave > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rdm at cfcl.com Thu May 14 12:44:25 2020 From: rdm at cfcl.com (Rich Morin) Date: Wed, 13 May 2020 19:44:25 -0700 Subject: [TUHS] v7 K&R C In-Reply-To: References: <3cb1126796176debe28aa66672ba27ae@yaccman.com> <20200511005745.GL17035@mcvoy.com> <357EFE54-BD94-4C10-8C43-C6735BF7D317@via.net> <20200511202555.GU17035@mcvoy.com> Message-ID: <71163EB4-683D-47DE-AAE2-93BF55C483E6@cfcl.com> > On May 13, 2020, at 17:42, John P. 
Linderman wrote: > > I never liked call by reference. When I was trying to understand a chunk of code, it was a great mental simplification to know that whatever a called routine did, it couldn't have an effect on the code I was trying to understand except through a returned value and (ghastly) global variables. ... A Fortran implementation I used years ago kept constants in a "literal pool". So, if you called a subroutine, passing in a constant, there was a possibility that the constant might be modified upon the routine's return. I don't recall this ever causing a problem in practice, but the possibility was amusing... -r From paul at mcjones.org Thu May 14 13:02:03 2020 From: paul at mcjones.org (Paul McJones) Date: Wed, 13 May 2020 20:02:03 -0700 Subject: [TUHS] v7 K&R C In-Reply-To: References: Message-ID: <4D7C0714-D559-4562-ADB5-8A0E313BC28E@mcjones.org> > On May 13, 2020, at 7:00 PM,Dave Horsfall wrote: > > I never could figure out why Stroustrup implemented that "feature"; let's > see, this operator usually means this, except when you use it in that > situation in which case it means something else. Now, try debugging that. C continues the tradition begun by Fortran and Algol 60 of overloading the arithmetic operators on the various numeric types. C++ allows new types to be defined; when a new type obeys the generally understood properties of a built-in type, it makes sense to use the same operator (or function) for the corresponding operation on the new type (e.g., addition on complex numbers, arbitrary-precision integers and rationals, polynomials, or matrices). -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From charles.unix.pro at gmail.com Thu May 14 13:09:51 2020 From: charles.unix.pro at gmail.com (Charles Anthony) Date: Wed, 13 May 2020 20:09:51 -0700 Subject: [TUHS] v7 K&R C In-Reply-To: <71163EB4-683D-47DE-AAE2-93BF55C483E6@cfcl.com> References: <3cb1126796176debe28aa66672ba27ae@yaccman.com> <20200511005745.GL17035@mcvoy.com> <357EFE54-BD94-4C10-8C43-C6735BF7D317@via.net> <20200511202555.GU17035@mcvoy.com> <71163EB4-683D-47DE-AAE2-93BF55C483E6@cfcl.com> Message-ID: On Wed, May 13, 2020 at 7:45 PM Rich Morin wrote: > > On May 13, 2020, at 17:42, John P. Linderman wrote: > > > > I never liked call by reference. When I was trying to understand a chunk > of code, it was a great mental simplification to know that whatever a > called routine did, it couldn't have an effect on the code I was trying to > understand except through a returned value and (ghastly) global variables. > ... > > A Fortran implementation I used years ago kept constants in a "literal > pool". So, if you called a subroutine, passing in a constant, there was a > possibility that the constant might be modified upon the routine's return. > I don't recall this ever causing a problem in practice, but the possibility > was amusing... > Ah yes. A long time ago, some one came to me with a mysteriously behaving Pr1me FORTRAN program; after much head scratching, I found where they were changing the value of "0". -- Charles > > -- X-Clacks-Overhead: GNU Terry Pratchett -------------- next part -------------- An HTML attachment was scrubbed... URL: From woods at robohack.ca Thu May 14 14:21:34 2020 From: woods at robohack.ca (Greg A. 
Woods) Date: Wed, 13 May 2020 21:21:34 -0700 Subject: [TUHS] v7 K&R C In-Reply-To: References: <3cb1126796176debe28aa66672ba27ae@yaccman.com> <20200511005745.GL17035@mcvoy.com> <357EFE54-BD94-4C10-8C43-C6735BF7D317@via.net> <20200511202555.GU17035@mcvoy.com> Message-ID: At Thu, 14 May 2020 09:36:57 +1000 (EST), Dave Horsfall wrote: Subject: Re: [TUHS] v7 K&R C > > On Tue, 12 May 2020, Paul Winalski wrote: > > > o operator overloading > > [...] > > I never could figure out why Stroustrup implemented that "feature"; > let's see, this operator usually means this, except when you use it in > that situation in which case it means something else. Now, try > debugging that. Well in the true OO world the ability to "overload" a message (aka what is sometimes effectively an operator) allows a wise designer to apply the traditional meaning of that message (operator) to a new kind of object. Attempts to change the meaning of a message (operator) when applied to already well known objects is forbidden by good taste and sane reviewers. C++ being a bit of a dog's breakfast seems to have given some people the idea that they can get away with abusing operator overloading for what can only amount to obfuscation. -- Greg A. Woods Kelowna, BC +1 250 762-7675 RoboHack Planix, Inc. Avoncote Farms -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 195 bytes Desc: OpenPGP Digital Signature URL: From imp at bsdimp.com Thu May 14 14:40:51 2020 From: imp at bsdimp.com (Warner Losh) Date: Wed, 13 May 2020 22:40:51 -0600 Subject: [TUHS] v7 K&R C In-Reply-To: References: <3cb1126796176debe28aa66672ba27ae@yaccman.com> <20200511005745.GL17035@mcvoy.com> <357EFE54-BD94-4C10-8C43-C6735BF7D317@via.net> <20200511202555.GU17035@mcvoy.com> Message-ID: On Wed, May 13, 2020, 10:22 PM Greg A. 
Woods wrote: > At Thu, 14 May 2020 09:36:57 +1000 (EST), Dave Horsfall > wrote: > Subject: Re: [TUHS] v7 K&R C > > > > On Tue, 12 May 2020, Paul Winalski wrote: > > > > > o operator overloading > > > > [...] > > > > I never could figure out why Stroustrup implemented that "feature"; > > let's see, this operator usually means this, except when you use it in > > that situation in which case it means something else. Now, try > > debugging that. > > Well in the true OO world the ability to "overload" a message (aka what > is sometimes effectively an operator) allows a wise designer to apply > the traditional meaning of that message (operator) to a new kind of > object. Attempts to change the meaning of a message (operator) when > applied to already well known objects is forbidden by good taste and > sane reviewers. > > C++ being a bit of a dog's breakfast seems to have given some people the > idea that they can get away with abusing operator overloading for what > can only amount to obfuscation. > Cue rant about << and >> overloading... Warner -- > Greg A. Woods > > Kelowna, BC +1 250 762-7675 RoboHack > Planix, Inc. Avoncote Farms > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave at horsfall.org Thu May 14 17:38:32 2020 From: dave at horsfall.org (Dave Horsfall) Date: Thu, 14 May 2020 17:38:32 +1000 (EST) Subject: [TUHS] v7 K&R C In-Reply-To: <71163EB4-683D-47DE-AAE2-93BF55C483E6@cfcl.com> References: <3cb1126796176debe28aa66672ba27ae@yaccman.com> <20200511005745.GL17035@mcvoy.com> <357EFE54-BD94-4C10-8C43-C6735BF7D317@via.net> <20200511202555.GU17035@mcvoy.com> <71163EB4-683D-47DE-AAE2-93BF55C483E6@cfcl.com> Message-ID: On Wed, 13 May 2020, Rich Morin wrote: > A Fortran implementation I used years ago kept constants in a "literal > pool". So, if you called a subroutine, passing in a constant, there was > a possibility that the constant might be modified upon the routine's > return.
I don't recall this ever causing a problem in practice, but the > possibility was amusing... As I dimly recall, Fortran has always used call by value/result (or whatever the term is). So, if you modify an argument that happened to be passed as a constant... -- Dave From ron at ronnatalie.com Thu May 14 22:25:16 2020 From: ron at ronnatalie.com (ron at ronnatalie.com) Date: Thu, 14 May 2020 08:25:16 -0400 Subject: [TUHS] v7 K&R C In-Reply-To: References: <3cb1126796176debe28aa66672ba27ae@yaccman.com> <20200511005745.GL17035@mcvoy.com> <357EFE54-BD94-4C10-8C43-C6735BF7D317@via.net> <20200511202555.GU17035@mcvoy.com> <71163EB4-683D-47DE-AAE2-93BF55C483E6@cfcl.com> Message-ID: <3f3cebe6682520577ccab2f1b81627cc.squirrel@squirrelmail.tuffmail.net> > On Wed, 13 May 2020, Rich Morin wrote: > >> A Fortran implementation I used years ago kept constants in a "literal >> pool". So, if you called a subroutine, passing in a constant, there was >> a possibility that the constant might be modified upon the routine's >> return. I don't recall this ever causing a problem in practice, but the >> possibility was amusing... > > As I dimly recall, Fortran has always used call by value/result (or > whatever the term is). So, if you modify an argument that happened to be > passed as a constant... > Fortran argument passing to functions is call by reference. Some compilers had a non-standard exception to allow call by value. From ron at ronnatalie.com Thu May 14 22:27:07 2020 From: ron at ronnatalie.com (ron at ronnatalie.com) Date: Thu, 14 May 2020 08:27:07 -0400 Subject: [TUHS] v7 K&R C In-Reply-To: References: <3cb1126796176debe28aa66672ba27ae@yaccman.com> <20200511005745.GL17035@mcvoy.com> <357EFE54-BD94-4C10-8C43-C6735BF7D317@via.net> <20200511202555.GU17035@mcvoy.com> <71163EB4-683D-47DE-AAE2-93BF55C483E6@cfcl.com> Message-ID: > Ah yes. 
A long time ago, some one came to me with a mysteriously behaving > Pr1me FORTRAN program; after much head scratching, I found where they were > changing the value of "0". > It was right up there when I traced a bug to find someone had added this line to one of the headers: #define notdef 1 From arnold at skeeve.com Thu May 14 23:04:56 2020 From: arnold at skeeve.com (arnold at skeeve.com) Date: Thu, 14 May 2020 07:04:56 -0600 Subject: [TUHS] [off topic] anyone want some QIC-250 tapes?
Message-ID: <202005141304.04ED4uXA023743@freefriends.org> I recently unearthed two 250 Mb QIC cartridges (like we used to use on Sun workstations). They were last written on in late 1997. I have no idea if they're any good, but they're free to anyone who'll pay postage (from Israel). If no takers, I'll just toss 'em. Thanks, Arnold From tangentdelta at protonmail.com Thu May 14 23:05:45 2020 From: tangentdelta at protonmail.com (TangentDelta) Date: Thu, 14 May 2020 13:05:45 +0000 Subject: [TUHS] Linotron 202 Information In-Reply-To: References: Message-ID: Here is a dump of the ROM in a text-based format. I couldn't think of a good way to represent the 4-bit words in a normal binary format with the order being ambiguous. Connecting a logic analyzer up to the ROM and triggering an "autoload" sequence, the processor reads ROM address 0, followed by ROM address 1, and then seems to lock up. I'm curious if the processor is attempting to store the 8-bit word into RAM for some reason? My RAM board is in very poor condition and I will need to devise a way to troubleshoot it. It'd also be helpful to have some of those control lines hooked up to the logic analyzer while it is happening. I'm working on a disassembler that should let me shuffle the order of the 4-bit words around until I get something that looks sane. ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐ On Wednesday, May 13, 2020 3:27 PM, Ken Thompson wrote: > can you send me the bits of the rom. > i will take a look. > > On Tue, May 12, 2020 at 12:44 PM TangentDelta via TUHS wrote: > >> Hello. >> >> I have a pair of controller card cages out of Mergenthaler Linotron 202 photo-typesetting machines. Sadly the machines themselves were scrapped, and these card cages are all I was able to save. >> >> The controllers use Computer Automation Naked Mini processors, which are relatively small 16-bit minicomputers designed for embedded control applications. 
I've been hacking on these for a few months now and have built up a system bus pinout diagram and several schematics. I haven't been able to find any technical information online in regards to the specific model of Naked Mini processor used in the 202, but I have found a trove of documents for other Naked Mini models on Bitsavers. >> >> I pulled the 512x4bit "boot ROM" mentioned in the "Experience with the Mergenthaler Linotron 202 Phototypesetter, or, How We Spent Our Summer Vacation" paper and dumped it, but the resulting binary doesn't produce any sane-looking code when manually disassembled using the documents on Bitsavers and reference, no matter how I arrange the nybbles. The processor also does not appear to respect the control opcodes issued by the Computer Automation LSI series programming console that I obtained. This has led me to the hypothesis that this is not a stock "Naked Mini" or later "Naked Milli" processor, but something specific to Mergenthaler. >> >> My goal is to get the processor to run my own code, and eventually design my own MaxiBus peripherals to use with it. >> >> If anyone knows where I can look for more information in regards to the 202 and the Naked Mini processor, or has any stories of working on these machines, I would greatly appreciate it! >> >> Thanks. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: bootrom_dump_line.txt URL: From paul.winalski at gmail.com Fri May 15 03:08:20 2020 From: paul.winalski at gmail.com (Paul Winalski) Date: Thu, 14 May 2020 13:08:20 -0400 Subject: [TUHS] v7 K&R C In-Reply-To: <4D7C0714-D559-4562-ADB5-8A0E313BC28E@mcjones.org> References: <4D7C0714-D559-4562-ADB5-8A0E313BC28E@mcjones.org> Message-ID: On 5/13/20, Paul McJones wrote: > > C continues the tradition begun by Fortran and Algol 60 of overloading the > arithmetic operators on the various numeric types. 
C++ allows new types to > be defined; when a new type obeys the generally understood properties of a > built-in type, it makes sense to use the same operator (or function) for the > corresponding operation on the new type (e.g., addition on complex numbers, > arbitrary-precision integers and rationals, polynomials, or matrices). I agree; that makes sense. But I don't like things such as << and >> in I/O-related classes. -Paul W. From paul.winalski at gmail.com Fri May 15 03:13:33 2020 From: paul.winalski at gmail.com (Paul Winalski) Date: Thu, 14 May 2020 13:13:33 -0400 Subject: [TUHS] v7 K&R C In-Reply-To: <71163EB4-683D-47DE-AAE2-93BF55C483E6@cfcl.com> References: <3cb1126796176debe28aa66672ba27ae@yaccman.com> <20200511005745.GL17035@mcvoy.com> <357EFE54-BD94-4C10-8C43-C6735BF7D317@via.net> <20200511202555.GU17035@mcvoy.com> <71163EB4-683D-47DE-AAE2-93BF55C483E6@cfcl.com> Message-ID: On 5/13/20, Rich Morin wrote: > > A Fortran implementation I used years ago kept constants in a "literal > pool". So, if you called a subroutine, passing in a constant, there was a > possibility that the constant might be modified upon the routine's return. > I don't recall this ever causing a problem in practice, but the possibility > was amusing... Any modern compiler worth its salt does literal pooling. Fortunately modern operating systems have the concept of read-only address space. These days attempts to modify literal pool constants will give you a memory access violation at the point where the illegal modification was made. -Paul W. From lm at mcvoy.com Fri May 15 03:21:07 2020 From: lm at mcvoy.com (Larry McVoy) Date: Thu, 14 May 2020 10:21:07 -0700 Subject: [TUHS] v7 K&R C In-Reply-To: References: <20200511005745.GL17035@mcvoy.com> <357EFE54-BD94-4C10-8C43-C6735BF7D317@via.net> <20200511202555.GU17035@mcvoy.com> Message-ID: <20200514172107.GI20771@mcvoy.com> On Wed, May 13, 2020 at 08:42:55PM -0400, John P. Linderman wrote: > I never liked call by reference. 
When I was trying to understand a chunk of > code, it was a great mental simplification to know that whatever a called > routine did, it couldn't have an effect on the code I was trying to > understand except through a returned value and (ghastly) global variables. Call by value is fine for things like a single integer or whatever. When you have some giant array, you want to pass a pointer. And "const" helps a lot with indicating the subroutine isn't going to change it. From erc at pobox.com Fri May 15 03:27:14 2020 From: erc at pobox.com (Ed Carp) Date: Thu, 14 May 2020 12:27:14 -0500 Subject: [TUHS] [off topic] anyone want some QIC-250 tapes? In-Reply-To: <202005141304.04ED4uXA023743@freefriends.org> References: <202005141304.04ED4uXA023743@freefriends.org> Message-ID: Are the readers still available anymore?

On 5/14/20, arnold at skeeve.com wrote: > I recently unearthed two 250 Mb QIC cartridges (like we used to use on > Sun workstations). They were last written on in late 1997. > > I have no idea if they're any good, but they're free to anyone who'll > pay postage (from Israel). > > If no takers, I'll just toss 'em. > > Thanks, > > Arnold > From lm at mcvoy.com Fri May 15 03:32:06 2020 From: lm at mcvoy.com (Larry McVoy) Date: Thu, 14 May 2020 10:32:06 -0700 Subject: [TUHS] v7 K&R C In-Reply-To: References: <20200511005745.GL17035@mcvoy.com> <357EFE54-BD94-4C10-8C43-C6735BF7D317@via.net> <20200511202555.GU17035@mcvoy.com> Message-ID: <20200514173206.GJ20771@mcvoy.com> On Thu, May 14, 2020 at 09:36:57AM +1000, Dave Horsfall wrote: > I had to learn C++ for a project at $WORK years ago (the client demanded > it), and boy was I glad when I left... Amen. I'm being a whiney grumpy old man, but I'm sort of glad I'm at the tail end of my career. Going into it now, there are some bright spots, and some dim ones, Go seems nice, Rust could have been nice but they just had to come up with a different syntax, I can't see why anyone would do anything other than an improved C like syntax, Java and C++ seem awful, D tried but threw too much into the language like C++ did, if D had had some restraint like Go does, D would probably be my language of choice. Personally, I just want a modernized C. If you want to see what I want take a look at https://www.little-lang.org/ It's got some perl goodness, regexps are part of the syntax, switches work on strings or regexps as well as constants, it's pleasant. And completely doable as an extension to C. Oh, and it has reference counting on auto allocated stuff so when it goes out of scope, free() is automatic. 
--lm From clemc at ccc.com Fri May 15 03:58:44 2020 From: clemc at ccc.com (Clem Cole) Date: Thu, 14 May 2020 13:58:44 -0400 Subject: [TUHS] v7 K&R C In-Reply-To: References: <4D7C0714-D559-4562-ADB5-8A0E313BC28E@mcjones.org> Message-ID: On Thu, May 14, 2020 at 1:09 PM Paul Winalski wrote: > I agree; that makes sense. But I don't like things such as << and >> in > I/O-related classes. > Amen. Clem -------------- next part -------------- An HTML attachment was scrubbed... URL: From doug at cs.dartmouth.edu Fri May 15 04:41:57 2020 From: doug at cs.dartmouth.edu (Doug McIlroy) Date: Thu, 14 May 2020 14:41:57 -0400 Subject: [TUHS] v7 K&R C Message-ID: <202005141841.04EIfvEZ063529@tahoe.cs.Dartmouth.EDU> > o operator overloading > > I never could figure out why Stroustrup implemented that "feature"; let's > see, this operator usually means this, except when you use it in that > situation in which case it means something else. Now, try debugging that. Does your antipathy extend to C++ IO and its heavily overloaded << and >>? The essence of object-oriented programming is operator overloading. If you think integer.add(integer) and matrix.add(matrix) are good, perspicuous, consistent style, then you have to think that integer+integer and matrix+matrix are even better. To put it more forcefully: the OO style is revoltingly asymmetric. If you like it why don't you do everyday arithmetic that way? I strongly encouraged Bjarne to support operator overloading, used it to write beautiful code, and do not regret a bit of it. I will agree, though, that the coercion rules that come along with operator (and method) overloading are dauntingly complicated. However, for natural uses (e.g. mixed-mode arithmetic) the rules work intuitively and well. Mathematics has prospered on operator overloading, and that's why I wanted it. My only regret is that Bjarne chose to set the vocabulary of infix operators in stone. 
Because there's no way to introduce new ones, users with poor taste are tempted to recycle the old ones for incongruous purposes. C++ offers more features than C and thus more ways to write obscure code. But when it happens, blame the writer, not the tool. Doug From rich.salz at gmail.com Fri May 15 04:45:35 2020 From: rich.salz at gmail.com (Richard Salz) Date: Thu, 14 May 2020 14:45:35 -0400 Subject: [TUHS] v7 K&R C In-Reply-To: <202005141841.04EIfvEZ063529@tahoe.cs.Dartmouth.EDU> References: <202005141841.04EIfvEZ063529@tahoe.cs.Dartmouth.EDU> Message-ID: > users with poor taste are tempted to recycle the old ones for incongruous Yes, and since most programmers have poor taste (members of this list being notable, and often self-proclaimed, exceptions) ... From clemc at ccc.com Fri May 15 06:54:30 2020 From: clemc at ccc.com (Clem Cole) Date: Thu, 14 May 2020 16:54:30 -0400 Subject: [TUHS] v7 K&R C In-Reply-To: <202005141841.04EIfvEZ063529@tahoe.cs.Dartmouth.EDU> References: <202005141841.04EIfvEZ063529@tahoe.cs.Dartmouth.EDU> Message-ID: On Thu, May 14, 2020 at 2:42 PM Doug McIlroy wrote: > The essence of object-oriented programming is operator overloading. Mumble -- I'm not so sure ... Kay coined the term, and I've not directly taken that away from his writings. But maybe I missed it. I'm a little reluctant to argue here. I feel a little like I did when I was arguing with my thesis advisor years ago ;_) I so respect your opinion and you have demonstrated to me that you are correct on so many things. > > Mathematics has prospered on operator overloading, and that's why I > wanted it. FWIW: That was Wulf's argument for the BLISS syntax for indirection. It made more sense mathematically. The problem was that the animals were already beyond the fields and long lost in the forest, so closing the barn door later didn't help.
Bill later recanted, saying that while the idea was the right one, in practice it was a bad idea. > ... > users with poor taste are tempted to recycle the old ones for incongruous > purposes. > Ah, this here is, of course, the crux of the issue. Who shall be the arbiters of good taste? Doug, most of the time I agree with you and think you have done a fantastic job of being one of those arbiters. But like my friend and mentor Wulf, I have to admit the ugly way we did it for years stands. What you bought, given what we got, seems unbalanced. There is way too much 'bad' code and I think the overloading multiplies the bad over the good. > > C++ offers more features than C and thus more ways to write obscure code. > It's worse than that. The language definition is constantly peed on by the masses. Whereas Dennis (and Steve) took a very measured approach as to when and how to add features to C and while it is admittedly quirky, I find C code a lot more understandable. To me, the features that were added to C were ones that experience showed made sense [structs/unions/function prototypes/void/void*]. But as Larry and I have pointed out, not all of them did (enums). I don't have the same warm feelings about C++. My complaint with C++ was (is) it is just 'too much'. If Bjarne had added classes and some of the original simpler things that his original "C with Classes" paper had and stopped, I think I might be willing to use it today. But like Larry, I avoid it if at all possible. In practice, it's a tarbaby, and little good has come of it in my world. I applaud Rob, Ken, Brian and Russ with Go - I think they hit on a better medium, certainly for userspace code. And thankfully they have not (so far) been tempted to 'fix it' (although I have heard rumors that Russ has things up his sleeve). And for me, the jury is still out on Rust (Dan Cross I admit got me to rethink its value a bit, but I have not yet used it for anything).
And Python, which I had hoped would be a reasonable replacement for Perl, also became a mess when people 'improved' it. > But when it happens, blame the writer, not the tool. > Fair point. I've seen some awesome BLISS code in my day. I bet if I looked I could find some excellent C++. But in my experience, the signal to noise ratio is not in favor of either. Respectfully, Clem -------------- next part -------------- An HTML attachment was scrubbed... URL: From dot at dotat.at Fri May 15 08:32:02 2020 From: dot at dotat.at (Tony Finch) Date: Thu, 14 May 2020 23:32:02 +0100 Subject: [TUHS] v7 K&R C In-Reply-To: <20200514173206.GJ20771@mcvoy.com> References: <20200511005745.GL17035@mcvoy.com> <357EFE54-BD94-4C10-8C43-C6735BF7D317@via.net> <20200511202555.GU17035@mcvoy.com> <20200514173206.GJ20771@mcvoy.com> Message-ID: Larry McVoy wrote: > > It's got some perl goodness, regexps are part of the syntax, .... I got into Unix after perl and I've used it a lot. Back in the 1990s I saw Henry Spencer's joke that perl was the Swiss Army Chainsaw of Unix, as a riff on lex being its Swiss Army Knife. I came to appreciate lex regrettably late: lex makes it remarkably easy to chew through a huge pile of text and feed the pieces to some library code written in C. I've been using re2c recently (http://re2c.org/), which is differently weird than lex, though it still uses YY in all its variable names. It's remarkable how much newer lexer/parser generators can't escape from the user interface of lex/yacc. Another YY example: http://www.hwaci.com/sw/lemon/ Tony. -- f.anthony.n.finch http://dotat.at/ Trafalgar: Cyclonic 6 to gale 8. Rough occasionally very rough in west and south. Thundery showers. Good, occasionally poor. 
From robpike at gmail.com Fri May 15 12:44:30 2020 From: robpike at gmail.com (Rob Pike) Date: Fri, 15 May 2020 12:44:30 +1000 Subject: [TUHS] v7 K&R C In-Reply-To: <202005141841.04EIfvEZ063529@tahoe.cs.Dartmouth.EDU> References: <202005141841.04EIfvEZ063529@tahoe.cs.Dartmouth.EDU> Message-ID: Perhaps for the first time in my career, I am about to disagree with Doug McIlroy. Sorry, Doug, but I feel the essence of object-oriented computing is not operator overloading but the representation of behavior. I know you love using o.o. in OO languages, but that is syntax, not semantics, and OO, not o.o., is about semantics. And of course, the purest of the OO languages do represent arithmetic as methods, but the fit of OO onto C was never going to be smooth. -rob On Fri, May 15, 2020 at 4:42 AM Doug McIlroy wrote: > > o operator overloading > > > > I never could figure out why Stroustrup implemented that "feature"; let's > > see, this operator usually means this, except when you use it in that > > situation in which case it means something else. Now, try debugging > that. > > Does your antipathy extend to C++ IO and its heavily overloaded << and >>? > > The essence of object-oriented programming is operator overloading. If you > think integer.add(integer) and matrix.add(matrix) are good, perspicuous, > consistent style, then you have to think that integer+integer and > matrix+matrix are even better. To put it more forcefully: the OO style > is revoltingly asymmetric. If you like it why don't you do everyday > arithmetic that way? > > I strongly encouraged Bjarne to support operator overloading, used it > to write beautiful code, and do not regret a bit of it. I will agree, > though, that the coercion rules that come along with operator (and > method) overloading are dauntingly complicated. However, for natural uses > (e.g. mixed-mode arithmetic) the rules work intuitively and well. > > Mathematics has prospered on operator overloading, and that's why I > wanted it. 
My only regret is that Bjarne chose to set the vocabulary of > infix operators in stone. Because there's no way to inroduce new ones, > users with poor taste are tempted to recycle the old ones for incongruous > purposes. > > C++ offers more features than C and thus more ways to write obscure code. > But when it happens, blame the writer, not the tool. > > Doug > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rdm at cfcl.com Fri May 15 13:57:01 2020 From: rdm at cfcl.com (Rich Morin) Date: Thu, 14 May 2020 20:57:01 -0700 Subject: [TUHS] v7 K&R C In-Reply-To: References: <202005141841.04EIfvEZ063529@tahoe.cs.Dartmouth.EDU> Message-ID: <311C6208-1BC9-4011-A6A8-38A148865DF3@cfcl.com> In case it helps, here is a definition of OO from Alan Kay: "OOP to me means only messaging, local retention and protection and hiding of state-process, and extreme late-binding of all things." -- http://userpage.fu-berlin.de/~ram/pub/pub_jf47ht81Ht/doc_kay_oop_en -r From iain at csp-partnership.co.uk Fri May 15 17:55:46 2020 From: iain at csp-partnership.co.uk (Dr Iain Maoileoin) Date: Fri, 15 May 2020 08:55:46 +0100 Subject: [TUHS] v7 K&R C In-Reply-To: References: <202005141841.04EIfvEZ063529@tahoe.cs.Dartmouth.EDU> Message-ID: Being Scottish and in the 70s our world was constrained by UK import restrictions - to protect our industries. As a boy I cut my teeth on a language called Algol68 that ran on a ICL 1904 (24 bit word and 6 bit byte, generally a capital letter only system!). The language was part of my academic course work. OK it was not a OO language but - in 1968 - it had strict type checking, structures, user-defined types, enums, void, casts, user-defined operators (overloaded) both infix and prefix, (all defined on a formal mathematical basis giving syntax and semantics) Together with “environment enquiries” to find out how big an int was or the precision of a float. 
Users could also define their own operators - think about it as no more than strange names of a variable or procedure - and also allocate priority to the various operators in that world (monadics ALWAYS had a priority of 10 and bound tightest). But it went too far. You could define (note that the concept of += did not exist in the base language in 1968) a new operator such as “+:=” op +:= = (ref int a, int b) ref int: a:=a+b; It took a pointer to an int, and an int, and returned the pointer [Of course you could also define it to be op +:= = (ref int a, int b) ref int: a:=a-b+7; ] You could even use Jensen’s device with operators. If you don't know Algol68 have a speed read of https://research.vu.nl/ws/portalfiles/portal/74119499/11057 My move to unix and C in the 70s was a huge retro step for me - but I could not develop systems code in Algol68 - for example the transput library was about 8K before you blinked. Certainly in C we could code more and faster - no type-checking and we had enough experience of compilers to understand what was going on at the machine code level - we could just drive the I/O registers directly. Then C++? Like Microsoft Windows I evaluated, tried it a bit and voted the theory good but the smell bad. I had a few students who wrote in C++ over a few years, but you know what, it did not do anything earth shattering and it could be a b*gger to work on a debug of a 20K line student program! Like some here I think C++ was just on the wrong side of a line that I don't understand. Similarly, for me, perl is on one side of that line and python is far over the other side. My question is: What is that line? I don't understand it. Effort input vs output? Complexity measure, debugging complexity in a 3rd party program? [I hated assembler too unless it was my own (or good) ;-)] But machine code was good, few people would do too much in a complicated way writing in binary/octal/hex!
> On 15 May 2020, at 03:44, Rob Pike wrote: > > Perhaps for the first time in my career, I am about to disagree with Doug McIlroy. Sorry, Doug, but I feel the essence of object-oriented computing is not operator overloading but the representation of behavior. I know you love using o.o. in OO languages, but that is syntax, not semantics, and OO, not o.o., is about semantics. > > And of course, the purest of the OO languages do represent arithmetic as methods, but the fit of OO onto C was never going to be smooth. > > -rob > > > On Fri, May 15, 2020 at 4:42 AM Doug McIlroy > wrote: > > o operator overloading > > > > I never could figure out why Stroustrup implemented that "feature"; let's > > see, this operator usually means this, except when you use it in that > > situation in which case it means something else. Now, try debugging that. > > Does your antipathy extend to C++ IO and its heavily overloaded << and >>? > > The essence of object-oriented programming is operator overloading. If you > think integer.add(integer) and matrix.add(matrix) are good, perspicuous, > consistent style, then you have to think that integer+integer and > matrix+matrix are even better. To put it more forcefully: the OO style > is revoltingly asymmetric. If you like it why don't you do everyday > arithmetic that way? > > I strongly encouraged Bjarne to support operator overloading, used it > to write beautiful code, and do not regret a bit of it. I will agree, > though, that the coercion rules that come along with operator (and > method) overloading are dauntingly complicated. However, for natural uses > (e.g. mixed-mode arithmetic) the rules work intuitively and well. > > Mathematics has prospered on operator overloading, and that's why I > wanted it. My only regret is that Bjarne chose to set the vocabulary of > infix operators in stone. Because there's no way to inroduce new ones, > users with poor taste are tempted to recycle the old ones for incongruous > purposes. 
> > C++ offers more features than C and thus more ways to write obscure code. > But when it happens, blame the writer, not the tool. > > Doug > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lm at mcvoy.com Sat May 16 01:01:22 2020 From: lm at mcvoy.com (Larry McVoy) Date: Fri, 15 May 2020 08:01:22 -0700 Subject: [TUHS] v7 K&R C In-Reply-To: References: <202005141841.04EIfvEZ063529@tahoe.cs.Dartmouth.EDU> Message-ID: <20200515150122.GF30160@mcvoy.com> On Fri, May 15, 2020 at 08:55:46AM +0100, Dr Iain Maoileoin wrote: > My question is: > What is that line? I dont understand it? Effort input vs output? > Complexity measure, debugging complexity in a 3rd party program? I think you are asking precisely the right question. There is a line where one side is good enough and good enough is just that. The other side is just too much, too much to get tangled up in. I think Rob and Ken and the rest of the Go team are not lauded enough for having the restraint to stay on the right side of the line. It's so easy to be seduced into yet another feature, "it's not that bad, we can do it." It is far harder to say "Nope, not gonna do that". The Go team reminds me a bit of the original QNX team. QNX was/is a message passing microkernel, it's the only microkernel that is actually micro (in my experience, I hear that L4 is good but haven't looked). They had a core team of 3 people that were allowed to touch the actual microkernel. All of which fit into a 4K instruction cache on x86. Every commit went through the filter of "does this add to the cache?" That sort of discipline is really rare. Far more common is "I benchmarked it and it only slows down by 1%" - that's death by a thousand paper cuts. From jpl.jpl at gmail.com Sat May 16 01:36:55 2020 From: jpl.jpl at gmail.com (John P. 
Linderman) Date: Fri, 15 May 2020 11:36:55 -0400 Subject: [TUHS] v7 K&R C In-Reply-To: <20200515150122.GF30160@mcvoy.com> References: <202005141841.04EIfvEZ063529@tahoe.cs.Dartmouth.EDU> <20200515150122.GF30160@mcvoy.com> Message-ID: On Fri, May 15, 2020 at 08:55:46AM +0100, Dr Iain Maoileoin wrote: > My question is: > What is that line? I dont understand it? Effort input vs output? > Complexity measure, debugging complexity in a 3rd party program? I think of it less as a line than as a continuum. I am reminded of an input routine I wrote for a sort several decades ago. Allocate a pointer from one end of memory, start reading a record into the other end of memory with something like while ((c = getchar()) != EOF) if (c == '\n') { /* entire record is now there */ Very easy to understand, no guesswork about allocating pointers versus records, and a complete and utter pig. 50% of the processing time of the entire sort went into loading records. There were 5 or 6 comparisons being done for every *character* of input (I omitted the bit about verifying that there was room to store the next character). It might have made for good reading, but nobody would have used it, because much faster sorts were already available, and most people *use* code, not *read* it. So I had to push into uglier territory to get something that worked well enough to be worth reading. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ron at ronnatalie.com Sat May 16 06:01:23 2020 From: ron at ronnatalie.com (ron at ronnatalie.com) Date: Fri, 15 May 2020 16:01:23 -0400 Subject: [TUHS] v7 K&R C In-Reply-To: References: <202005141841.04EIfvEZ063529@tahoe.cs.Dartmouth.EDU> <20200515150122.GF30160@mcvoy.com> Message-ID: <014001d62af3$9cc209b0$d6461d10$@ronnatalie.com> Unfortunately, if c is char on a machine with unsigned chars, or it’s of type unsigned char, the EOF will never be detected. 
* while ((c = getchar()) != EOF) if (c == '\n') { /* entire record is now there */ From lm at mcvoy.com Sat May 16 06:03:54 2020 From: lm at mcvoy.com (Larry McVoy) Date: Fri, 15 May 2020 13:03:54 -0700 Subject: [TUHS] v7 K&R C In-Reply-To: <014001d62af3$9cc209b0$d6461d10$@ronnatalie.com> References: <202005141841.04EIfvEZ063529@tahoe.cs.Dartmouth.EDU> <20200515150122.GF30160@mcvoy.com> <014001d62af3$9cc209b0$d6461d10$@ronnatalie.com> Message-ID: <20200515200354.GL30160@mcvoy.com> I think every old Unix hand does int c; // EOF is typically -1 On Fri, May 15, 2020 at 04:01:23PM -0400, ron at ronnatalie.com wrote: > Unfortunately, if c is char on a machine with unsigned chars, or it's of type unsigned char, the EOF will never be detected. > > * while ((c = getchar()) != EOF) if (c == '\n') { /* entire record is now there */ > -- --- Larry McVoy lm at mcvoy.com http://www.mcvoy.com/lm From clemc at ccc.com Sat May 16 06:05:02 2020 From: clemc at ccc.com (Clem Cole) Date: Fri, 15 May 2020 16:05:02 -0400 Subject: [TUHS] v7 K&R C In-Reply-To: <014001d62af3$9cc209b0$d6461d10$@ronnatalie.com> References: <202005141841.04EIfvEZ063529@tahoe.cs.Dartmouth.EDU> <20200515150122.GF30160@mcvoy.com> <014001d62af3$9cc209b0$d6461d10$@ronnatalie.com> Message-ID: Ron, Hmmm... getchar/getc are defined as returning int in the man page and C is traditionally defined as an int in this code.. On Fri, May 15, 2020 at 4:02 PM wrote: > Unfortunately, if c is char on a machine with unsigned chars, or it’s of > type unsigned char, the EOF will never be detected. > > - while ((c = getchar()) != EOF) if (c == '\n') { /* entire record is > now there */ > -------------- next part -------------- An HTML attachment was scrubbed...
URL: From ron at ronnatalie.com Sat May 16 06:18:37 2020 From: ron at ronnatalie.com (ron at ronnatalie.com) Date: Fri, 15 May 2020 16:18:37 -0400 Subject: [TUHS] v7 K&R C In-Reply-To: References: <202005141841.04EIfvEZ063529@tahoe.cs.Dartmouth.EDU> <20200515150122.GF30160@mcvoy.com> <014001d62af3$9cc209b0$d6461d10$@ronnatalie.com> Message-ID: <019701d62af6$050344b0$0f09ce10$@ronnatalie.com> EOF is defined to be -1. getchar() returns int, but c is a unsigned char, the value of (c = getchar()) will be 255. This will never compare equal to -1. Ron, Hmmm... getchar/getc are defined as returning int in the man page and C is traditionally defined as an int in this code.. On Fri, May 15, 2020 at 4:02 PM > wrote: Unfortunately, if c is char on a machine with unsigned chars, or it’s of type unsigned char, the EOF will never be detected. * while ((c = getchar()) != EOF) if (c == '\n') { /* entire record is now there */ -------------- next part -------------- An HTML attachment was scrubbed... URL: From clemc at ccc.com Sat May 16 06:24:47 2020 From: clemc at ccc.com (Clem Cole) Date: Fri, 15 May 2020 16:24:47 -0400 Subject: [TUHS] v7 K&R C In-Reply-To: <019701d62af6$050344b0$0f09ce10$@ronnatalie.com> References: <202005141841.04EIfvEZ063529@tahoe.cs.Dartmouth.EDU> <20200515150122.GF30160@mcvoy.com> <014001d62af3$9cc209b0$d6461d10$@ronnatalie.com> <019701d62af6$050344b0$0f09ce10$@ronnatalie.com> Message-ID: I suspect we are saying the same thing. C is defined as an int (as Larry also showed), not an unsigned char (and frankly if you had done that, most modern compilers will give you a warning). IIRC you are correct that the Ritchie compiler would not catch that error. But, the truth is I know few C (experienced) programmers that would define c as anything but an int; particularly in the modern era with compiler warnings as good as they are. Clem On Fri, May 15, 2020 at 4:18 PM wrote: > EOF is defined to be -1. 
> > getchar() returns int, but c is a unsigned char, the value of (c = > getchar()) will be 255. This will never compare equal to -1. > > > > > > > > Ron, > > > > Hmmm... getchar/getc are defined as returning int in the man page and C is > traditionally defined as an int in this code.. > > > > On Fri, May 15, 2020 at 4:02 PM wrote: > > Unfortunately, if c is char on a machine with unsigned chars, or it’s of > type unsigned char, the EOF will never be detected. > > > > > > > > > - while ((c = getchar()) != EOF) if (c == '\n') { /* entire record is > now there */ > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From doug at cs.dartmouth.edu Sat May 16 06:34:59 2020 From: doug at cs.dartmouth.edu (Doug McIlroy) Date: Fri, 15 May 2020 16:34:59 -0400 Subject: [TUHS] v7 K&R C Message-ID: <202005152034.04FKYxuu070825@tahoe.cs.Dartmouth.EDU> > I feel the essence of object-oriented computing > is not operator overloading but the representation of behavior. Rob is right. Overloading is a universal characteristic of OO programming, but not the essence. Doug From imp at bsdimp.com Sat May 16 06:40:23 2020 From: imp at bsdimp.com (Warner Losh) Date: Fri, 15 May 2020 14:40:23 -0600 Subject: [TUHS] v7 K&R C In-Reply-To: <202005152034.04FKYxuu070825@tahoe.cs.Dartmouth.EDU> References: <202005152034.04FKYxuu070825@tahoe.cs.Dartmouth.EDU> Message-ID: On Fri, May 15, 2020, 2:35 PM Doug McIlroy wrote: > > I feel the essence of object-oriented computing > > is not operator overloading but the representation of behavior. > > Rob is right. Overloading is a universal characteristic > of OO programming, but not the essence. > I've viewed the essence as everything is an object and I can send messages to objects. From there, many different styles flow as different efforts leveraged different means and methods to purport to achieve that goal. Warner > -------------- next part -------------- An HTML attachment was scrubbed... 

From usotsuki at buric.co Sat May 16 06:56:42 2020
From: usotsuki at buric.co (Steve Nickolas)
Date: Fri, 15 May 2020 16:56:42 -0400 (EDT)
Subject: [TUHS] v7 K&R C
In-Reply-To: <014001d62af3$9cc209b0$d6461d10$@ronnatalie.com>
References: <202005141841.04EIfvEZ063529@tahoe.cs.Dartmouth.EDU>
 <20200515150122.GF30160@mcvoy.com>
 <014001d62af3$9cc209b0$d6461d10$@ronnatalie.com>
Message-ID:

On Fri, 15 May 2020, ron at ronnatalie.com wrote:

> Unfortunately, if c is char on a machine with unsigned chars, or it’s of
> type unsigned char, the EOF will never be detected.

Isn't it nonstandard (although I am aware of some compilers that do it) to
default the type of char to unsigned?

-uso.

From mstiller at me.com Sat May 16 07:00:04 2020
From: mstiller at me.com (Michael Stiller)
Date: Fri, 15 May 2020 23:00:04 +0200
Subject: [TUHS] 11/40 CPU (Emulation) test programs still available somewhere
Message-ID: <15E49B16-E633-47B4-9CFB-42C863559CA4@me.com>

Hi Legends,

During corona times I “ported” Dave Cheney’s avr11 PDP-11/40 emulator to a
Teensy 4 MCU board, fixed some bugs in it, and added multiple rk drives and
a partially working tm emulation (space forward/reverse unimplemented).

It runs V6 at about 1 MIPS and is perfectly usable with multiple rk and tm
drives.

Playing around with V6, I noticed the Shoppa disk and tried to get ncc
running, but after some debugging it looks like it is stuck in some loop,
which I think is due to bugs in the emulator code.

So the question is: are there programs, runnable under V6 or bootable from
rk, that test the CPU functionality?

I already had a look at some XXDP disks, but the ones I found (while they
boot) seem to lack basic CPU tests.

Best regards,

Michael

From richard at inf.ed.ac.uk Sat May 16 07:31:38 2020
From: richard at inf.ed.ac.uk (Richard Tobin)
Date: Fri, 15 May 2020 22:31:38 +0100 (BST)
Subject: [TUHS] v7 K&R C
In-Reply-To: Steve Nickolas's message of Fri, 15 May 2020 16:56:42 -0400 (EDT)
Message-ID: <20200515213138.8E0F72D2D71E@macaroni.inf.ed.ac.uk>

> Isn't it nonstandard (although I am aware of some compilers that do it) to
> default the type of char to unsigned?

No.

"The implementation shall define char to have the same range,
representation, and behavior as either signed char or unsigned char."
- C99

(Technically it's a separate type from both of them.)

-- Richard

-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

From usotsuki at buric.co Sat May 16 07:53:22 2020
From: usotsuki at buric.co (Steve Nickolas)
Date: Fri, 15 May 2020 17:53:22 -0400 (EDT)
Subject: [TUHS] v7 K&R C
In-Reply-To: <20200515213138.8E0F72D2D71E@macaroni.inf.ed.ac.uk>
References: <20200515213138.8E0F72D2D71E@macaroni.inf.ed.ac.uk>
Message-ID:

On Fri, 15 May 2020, Richard Tobin wrote:

>> Isn't it nonstandard (although I am aware of some compilers that do it) to
>> default the type of char to unsigned?
>
> No.
>
> "The implementation shall define char to have the same range,
> representation, and behavior as either signed char or unsigned char."
> - C99
>
> (Technically it's a separate type from both of them.)
>
> -- Richard

Huh. I thought all integers were supposed to be signed by default
regardless of their size. o.o

That said, I do use "int c; ... c=fgetc(stdin);" or the like in my code.

-uso.

From ron at ronnatalie.com Sat May 16 08:33:47 2020
From: ron at ronnatalie.com (ron at ronnatalie.com)
Date: Fri, 15 May 2020 18:33:47 -0400
Subject: [TUHS] v7 K&R C
In-Reply-To:
References: <20200515213138.8E0F72D2D71E@macaroni.inf.ed.ac.uk>
Message-ID: <077a01d62b08$e696bee0$b3c43ca0$@ronnatalie.com>

Char is different. One of the silly foibles of C.
char can be signed or unsigned at the implementation's decision.

-----Original Message-----
From: Steve Nickolas
Sent: Friday, May 15, 2020 5:53 PM
To: Richard Tobin
Cc: Steve Nickolas; ron at ronnatalie.com; tuhs at tuhs.org
Subject: Re: [TUHS] v7 K&R C

Huh. I thought all integers were supposed to be signed by default
regardless of their size. o.o

That said, I do use "int c; ... c=fgetc(stdin);" or the like in my code.

-uso.

From steffen at sdaoden.eu Sat May 16 09:34:27 2020
From: steffen at sdaoden.eu (Steffen Nurpmeso)
Date: Sat, 16 May 2020 01:34:27 +0200
Subject: [TUHS] v7 K&R C
In-Reply-To: <077a01d62b08$e696bee0$b3c43ca0$@ronnatalie.com>
References: <20200515213138.8E0F72D2D71E@macaroni.inf.ed.ac.uk>
 <077a01d62b08$e696bee0$b3c43ca0$@ronnatalie.com>
Message-ID: <20200515233427.31Vab%steffen@sdaoden.eu>

ron at ronnatalie.com wrote in
<077a01d62b08$e696bee0$b3c43ca0$@ronnatalie.com>:
 |Char is different. One of the silly foibles of C. char can be signed or
 |unsigned at the implementation's decision.

And I wish Thompson and Pike had felt the need to design UTF-8 ten years
earlier. Maybe then we would have a halfway usable "wide" character
interface in the standard (C) library.

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)

From beebe at math.utah.edu Sat May 16 10:15:29 2020
From: beebe at math.utah.edu (Nelson H. F. Beebe)
Date: Fri, 15 May 2020 18:15:29 -0600
Subject: [TUHS] v7 K&R C
Message-ID:

Discussions today on the TUHS list about the signed/unsigned nature of
the C char type led me to reexamine logs of my feature test package at

    http://www.math.utah.edu/pub/features/

I had 170 build logs for it from 2017.11.07, so I moved those aside
and ran another set of builds in our current enlarged test farm. That
generated another 361 fresh builds. Those tests are all with the C
compiler named "cc". I did not explore what other C compilers did,
but I strongly suspect that they all agree on any single platform.

On all but THREE systems, the tests report that "char" is signed, with
CHAR_MAX == +127.

The three outliers have char unsigned with CHAR_MAX == +255, and are

    * ARM armv7l Linux 4.13.1 (2017) and 5.6.7 (2020)
    * SGI O2 R10000-SC (150 MHz) IRIX 6.5 (2017 and 2020)
    * IBM POWER8 CentOS Linux release 7.4.1708 (AltArch) (2017)

So, while the ISO C Standards, and historical practice, leave it
implementation dependent whether char is signed or unsigned, there is
a strong majority for a signed type.

-------------------------------------------------------------------------------
- Nelson H. F. Beebe                    Tel: +1 801 581 5254                  -
- University of Utah                    FAX: +1 801 581 4148                  -
- Department of Mathematics, 110 LCB    Internet e-mail: beebe at math.utah.edu  -
- 155 S 1400 E RM 233                       beebe at acm.org  beebe at computer.org  -
- Salt Lake City, UT 84112-0090, USA    URL: http://www.math.utah.edu/~beebe/ -
-------------------------------------------------------------------------------

From peter at rulingia.com Sat May 16 10:31:10 2020
From: peter at rulingia.com (Peter Jeremy)
Date: Sat, 16 May 2020 10:31:10 +1000
Subject: [TUHS] v7 K&R C
In-Reply-To:
References: <202005141841.04EIfvEZ063529@tahoe.cs.Dartmouth.EDU>
 <20200515150122.GF30160@mcvoy.com>
 <014001d62af3$9cc209b0$d6461d10$@ronnatalie.com>
Message-ID: <20200516003110.GA23382@server.rulingia.com>

On 2020-May-15 16:56:42 -0400, Steve Nickolas wrote:
>Isn't it nonstandard (although I am aware of some compilers that do it) to
>default the type of char to unsigned?

The standard allows "char" to be either signed or unsigned. The ARM ABI
defines char as unsigned.

I recall that Lattice C on the M68K allowed either signed or unsigned char
via a flag. Setting it to "unsigned" generally produced faster code on
my Amiga, though some code assumed signed chars and broke.

-- 
Peter Jeremy

From steffen at sdaoden.eu Sat May 16 10:28:05 2020
From: steffen at sdaoden.eu (Steffen Nurpmeso)
Date: Sat, 16 May 2020 02:28:05 +0200
Subject: [TUHS] v7 K&R C
In-Reply-To:
References:
Message-ID: <20200516002805.NJgvh%steffen@sdaoden.eu>

Nelson H. F. Beebe wrote:
 |Discussions today on the TUHS list about the signed/unsigned nature of
 |the C char type led me to reexamine logs of my feature test package at
 |
 |    http://www.math.utah.edu/pub/features/
 |
 |I had 170 build logs for it from 2017.11.07, so I moved those aside
 |and ran another set of builds in our current enlarged test farm. That
 |generated another 361 fresh builds. Those tests are all with the C
 |compiler named "cc". I did not explore what other C compilers did,
 |but I strongly suspect that they all agree on any single platform.
 |
 |On all but THREE systems, the tests report that "char" is signed, with
 |CHAR_MAX == +127.
 |
 |The three outliers have char unsigned with CHAR_MAX == +255, and are
 |
 |    * ARM armv7l Linux 4.13.1 (2017) and 5.6.7 (2020)
 |    * SGI O2 R10000-SC (150 MHz) IRIX 6.5 (2017 and 2020)
 |    * IBM POWER8 CentOS Linux release 7.4.1708 (AltArch) (2017)
 |
 |So, while the ISO C Standards, and historical practice, leave it
 |implementation dependent whether char is signed or unsigned, there is
 |a strong majority for a signed type.

Just to note Linus Torvalds's "famous" "It better had been unsigned,
Virginia".

--steffen

From jpl.jpl at gmail.com Sat May 16 10:43:22 2020
From: jpl.jpl at gmail.com (John P. Linderman)
Date: Fri, 15 May 2020 20:43:22 -0400
Subject: [TUHS] v7 K&R C
In-Reply-To: <014001d62af3$9cc209b0$d6461d10$@ronnatalie.com>
References: <202005141841.04EIfvEZ063529@tahoe.cs.Dartmouth.EDU>
 <20200515150122.GF30160@mcvoy.com>
 <014001d62af3$9cc209b0$d6461d10$@ronnatalie.com>
Message-ID:

If I had been thick enough to declare c as an unsigned char, it would have
taken a good bit more than 50% of the time, and wouldn't have worked at all.

On Fri, May 15, 2020 at 4:02 PM wrote:

> Unfortunately, if c is char on a machine with unsigned chars, or it’s of
> type unsigned char, the EOF will never be detected.
>
>    - while ((c = getchar()) != EOF) if (c == '\n') { /* entire record is
>    now there */
>

From imp at bsdimp.com Sat May 16 10:49:44 2020
From: imp at bsdimp.com (Warner Losh)
Date: Fri, 15 May 2020 18:49:44 -0600
Subject: [TUHS] Status of Net/2
Message-ID:

What's the current status of net/2?

I ask because I have a FreeBSD 1.1.5.1 CVS repo that I'd like to make
available. Some of the files in it are encumbered, though, and the
University of California has communicated that fact. But what does that
actually mean now that V7 has been released and that's what the files were
based on? Are they no longer encumbered?

Warner

From clemc at ccc.com Sat May 16 10:56:20 2020
From: clemc at ccc.com (Clem Cole)
Date: Fri, 15 May 2020 20:56:20 -0400
Subject: [TUHS] Status of Net/2
In-Reply-To:
References:
Message-ID:

At this point I believe that it is now clear. It’s still based on V7, but
all of that is covered by the ancient system license/release of a few years
ago.

On Fri, May 15, 2020 at 8:50 PM Warner Losh wrote:

> What's the current status of net/2?
>
> I ask because I have a FreeBSD 1.1.5.1 CVS repo that I'd like to make
> available.
> Some of the files in it are encumbered, though, and the
> University of California has communicated that fact. But what does that
> actually mean now that V7 has been released and that's what the files were
> based on? Are they no longer encumbered?
>
> Warner
>
-- 
Sent from a handheld expect more typos than usual

From brantley at coraid.com Sat May 16 10:57:28 2020
From: brantley at coraid.com (Brantley Coile)
Date: Sat, 16 May 2020 00:57:28 +0000
Subject: [TUHS] v7 K&R C
In-Reply-To: <019701d62af6$050344b0$0f09ce10$@ronnatalie.com>
References: <202005141841.04EIfvEZ063529@tahoe.cs.Dartmouth.EDU>
 <20200515150122.GF30160@mcvoy.com>
 <014001d62af3$9cc209b0$d6461d10$@ronnatalie.com>
 <019701d62af6$050344b0$0f09ce10$@ronnatalie.com>
Message-ID: <0A2C62EF-E43E-43E4-8C53-CE4C99BC5B32@coraid.com>

I always kept local, single characters in ints. This avoided the problem
with loading a character being signed or unsigned. The reason for not
specifying is obvious. Today, you can pick the move-byte-into-word
instruction that either sign extends or doesn't. But when C was defined
that wasn't the case. Some machines sign extended when a byte was loaded
into a register and some filled the upper bits with zero. For machines that
filled with zero, a char was unsigned. If you forced the language to do one
or the other, it would be expensive on the opposite kind of machine.

It's one of the things that made C a good choice on a wide variety of
machines.

I guess I always "saw" the return value of getchar() as being in an
int-sized register, at first namely R0, so I kept the character values
returned as ints. The actual EOF indication from a read is a return value
of zero for the number of characters read.

But I'm just making noise because I'm sure everyone knows all this.

Brantley

> On May 15, 2020, at 4:18 PM, ron at ronnatalie.com wrote:
>
> EOF is defined to be -1.
> getchar() returns int, but if c is an unsigned char, the value of
> (c = getchar()) will be 255. This will never compare equal to -1.
>
> Ron,
>
> Hmmm... getchar/getc are defined as returning int in the man page, and c
> is traditionally defined as an int in this code.
>
> On Fri, May 15, 2020 at 4:02 PM wrote:
>> Unfortunately, if c is char on a machine with unsigned chars, or it’s of
>> type unsigned char, the EOF will never be detected.
>>
>>> • while ((c = getchar()) != EOF) if (c == '\n') { /* entire record is now there */

From lm at mcvoy.com Sat May 16 11:26:41 2020
From: lm at mcvoy.com (Larry McVoy)
Date: Fri, 15 May 2020 18:26:41 -0700
Subject: [TUHS] v7 K&R C
In-Reply-To: <20200515233427.31Vab%steffen@sdaoden.eu>
References: <20200515213138.8E0F72D2D71E@macaroni.inf.ed.ac.uk>
 <077a01d62b08$e696bee0$b3c43ca0$@ronnatalie.com>
 <20200515233427.31Vab%steffen@sdaoden.eu>
Message-ID: <20200516012641.GV30160@mcvoy.com>

On Sat, May 16, 2020 at 01:34:27AM +0200, Steffen Nurpmeso wrote:
> ron at ronnatalie.com wrote in
> <077a01d62b08$e696bee0$b3c43ca0$@ronnatalie.com>:
>  |Char is different. One of the silly foibles of C. char can be signed or
>  |unsigned at the implementation's decision.
>
> And I wish Thompson and Pike had felt the need to design UTF-8 ten years
> earlier. Maybe then we would have a halfway usable "wide" character
> interface in the standard (C) library.

Yeah, I agree. UTF-8 is clever, really clever. It makes the other
stuff look ham-handed.

From grog at lemis.com Sat May 16 11:39:30 2020
From: grog at lemis.com (Greg 'groggy' Lehey)
Date: Sat, 16 May 2020 11:39:30 +1000
Subject: [TUHS] Status of Net/2
In-Reply-To:
References:
Message-ID: <20200516013930.GG1670@eureka.lemis.com>

On Friday, 15 May 2020 at 18:49:44 -0600, Warner Losh wrote:
> What's the current status of net/2?
>
> I ask because I have a FreeBSD 1.1.5.1 CVS repo that I'd like to make
> available.
> Some of the files in it are encumbered, though, and the
> University of California has communicated that fact. But what does that
> actually mean now that V7 has been released and that's what the files were
> based on? Are they no longer encumbered?

To the best of my knowledge, Net/2 would be covered by the license
granted by Caldera on 23 January 2002:

  Caldera International, Inc. hereby grants a fee free license that
  includes the rights use, modify and distribute this named source
  code, including creating derived binary products created from the
  source code. The source code for which Caldera International,
  Inc. grants rights are limited to the following UNIX Operating
  Systems that operate on the 16-Bit PDP-11 CPU and early versions of
  the 32-Bit UNIX Operating System, with specific exclusion of UNIX
  System III and UNIX System V and successor operating systems:

    32-bit 32V UNIX
    16 bit UNIX Versions 1, 2, 3, 4, 5, 6, 7

I'm attaching the PDF of the license agreement, along with an email
from Dion Johnson to wkt (misspelt as wht) the following day. It
doesn't specifically address any particular operating system, but it
was my understanding that this would free all BSD versions.

Greg
-- 
Sent from my desktop computer.
Finger grog at lemis.com for PGP public key.
See complete headers for address and phone numbers.
This message is digitally signed. If your Microsoft mail program
reports problems, please read http://lemis.com/broken-MUA
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ancient-source-all.pdf
Type: application/pdf
Size: 12299 bytes
-------------- next part --------------
An embedded message was scrubbed...
From: Dion Johnson
Subject: Liberal license for ancient UNIX sources
Date: Wed, 23 Jan 2002 15:03:37 -0800
Size: 26173

From imp at bsdimp.com Sat May 16 11:52:01 2020
From: imp at bsdimp.com (Warner Losh)
Date: Fri, 15 May 2020 19:52:01 -0600
Subject: [TUHS] v7 K&R C
In-Reply-To:
References:
Message-ID:

On Fri, May 15, 2020 at 6:23 PM Nelson H. F. Beebe wrote:

> Discussions today on the TUHS list about the signed/unsigned nature of
> the C char type led me to reexamine logs of my feature test package at
>
>     http://www.math.utah.edu/pub/features/
>
> I had 170 build logs for it from 2017.11.07, so I moved those aside
> and ran another set of builds in our current enlarged test farm. That
> generated another 361 fresh builds. Those tests are all with the C
> compiler named "cc". I did not explore what other C compilers did,
> but I strongly suspect that they all agree on any single platform.
>
> On all but THREE systems, the tests report that "char" is signed, with
> CHAR_MAX == +127.
>
> The three outliers have char unsigned with CHAR_MAX == +255, and are
>
>     * ARM armv7l Linux 4.13.1 (2017) and 5.6.7 (2020)
>     * SGI O2 R10000-SC (150 MHz) IRIX 6.5 (2017 and 2020)
>     * IBM POWER8 CentOS Linux release 7.4.1708 (AltArch) (2017)
>
> So, while the ISO C Standards, and historical practice, leave it
> implementation dependent whether char is signed or unsigned, there is
> a strong majority for a signed type.
>

ARM has been the biggest outlier in terms of unsigned char. In FreeBSD,
this has been the second largest source of bugs with the platform... the
OABI's weird alignment requirements being the first (thankfully behind
us)...

Warner

From reed at reedmedia.net Sat May 16 12:28:01 2020
From: reed at reedmedia.net (Jeremy C. Reed)
Date: Fri, 15 May 2020 21:28:01 -0500 (CDT)
Subject: [TUHS] Status of Net/2
In-Reply-To: <20200516013930.GG1670@eureka.lemis.com>
References: <20200516013930.GG1670@eureka.lemis.com>
Message-ID:

> To the best of my knowledge, Net/2 would be covered by the license
> granted by Caldera on 23 January 2002:

Except rulings since then said they never had the right (as they never
owned the rights).

###### from my slide notes ########

TITLE=Who owns ancient Unix?

IMAGE=images/netbsd-pcc-caldera-license-screenshot.png

Some background:

BULLET=Western Electric's patent department told the Bell Labs
developers to remove all copyright notices from all Unix files.
NOTE:They shipped code that did have a license agreement nevertheless.
BULLET=These 1970's distributions pre-dated the US copyright
law changes in 1989 (due to the Berne Convention), which made
copyrights automatic.
BULLET=Western Electric / BTL's purposeful removal of copyrights may have
meant forfeiture of copyright.

Western Electric's patent department told the Bell Labs developers to
remove all copyright notices from all Unix files. They shipped code
that did have a license agreement nevertheless.

These 1970's distributions pre-dated the US copyright
law changes in 1989 (due to the Berne Convention), which made
copyrights automatic.

The 1970's purposeful removal of copyrights may have meant forfeiture
of copyright.

BULLET=In 1984, AT&T did retroactively copyright some of their ancient
Unix code.
BULLET=They also mistakenly placed their copyright on code copyrighted
by the Regents of the University of California.

In 1984, AT&T did retroactively copyright some of their ancient Unix
code. They also placed their copyright on code copyrighted by the
Regents of the University of California.
Here is an example:
https://github.com/att/uwin/blob/master/src/cmd/captoinfo/otermcap.c

BULLET=Unix System Laboratories (USL) was formed for Bell Labs around
1989 to take responsibility for Unix development and Unix licensing
activities.
NOTE:It became a subsidiary of AT&T.

Unix System Laboratories (USL) was formed for Bell Labs around 1989 to
take responsibility for Unix development and Unix licensing
activities. It became a subsidiary of AT&T.

BULLET=In 1993, Novell purchased USL and its Unix assets (including
copyrights).

In 1993, Novell purchased USL and its Unix assets (including
copyrights).

NOTE: In regards to AT&T/Novell vs. UC/BSDI ...

BULLET=In 1993, the judge shared the opinion and again reaffirmed that
USL "failed to demonstrate a likelihood that it can successfully defend
its copyright in 32V"

# Salus told me (around 2011) he was of the opinion (shared by the
folks at Cravath, Swain..., IBM's lawyers) that V1-7 and 32V were
covered by Judge Dickinson Debevoise's finding on 3 March 1993
(reaffirmed on 30 March 1993) that it was "unlikely" that Novell could
successfully maintain copyright to the early UNIX versions or the BSD
versions 2-4.4.

Also in 1993, Judge Dickinson Debevoise shared the opinion and again
reaffirmed that USL "failed to demonstrate a likelihood that it can
successfully defend its copyright in 32V" (that is, the ancient Unix).

http://tech-insider.org/usl-v-bsdi-ucb/research/1993/0303.html
http://tech-insider.org/usl-v-bsdi-ucb/research/1993/0330.html

BULLET=In 1995, Novell transferred some Unix rights to the Santa Cruz
Operation (SCO). Their agreement specifically excluded copyrights,
multiple times.
BULLET=SCO believed they purchased the Unix copyrights.
BULLET=Novell contended it retained the copyright ownership.

In 1995, Novell intended to sell its Unix business. It transferred
some Unix rights to the Santa Cruz Operation (SCO). Their agreement
specifically excluded copyrights, multiple times.
SCO believed they purchased the Unix copyrights.
Novell contended it retained the copyright ownership.

https://www.ca10.uscourts.gov/opinions/08/08-4217.pdf

#############

TITLE=Who owns ancient Unix? (continued)

IMAGE=images/Caldera-license.png

BULLET=In 2001, SCO sold its Unix business, including its believed
ownership of Unix copyrights, to Caldera.

In 2001, SCO sold its Unix business, including its believed
ownership of Unix copyrights, to Caldera.

NOTE: SCO renamed itself to Tarantella

CITE: https://web.archive.org/web/20071001003614/http://sec.edgar-online.com/2001/05/16/0001012870-01-500891/Section7.asp

BULLET=In 2002, Caldera widely announced that the ancient Unix code
(through 32V) was copyrighted by Caldera and licensed under an open
source license.
NOTE: They (assuming they owned it) gave the 1970's code away to the
world.

In 2002, Caldera widely announced that the ancient Unix code
(through 32V) was copyrighted by Caldera and licensed under an open
source license. They (assuming they owned it) gave the 1970's code
away to the world.

http://www.lemis.com/grog/UNIX/
http://www.lemis.com/grog/UNIX/ancient-source-all.pdf
also at
http://www.tuhs.org/Archive/Caldera-license.pdf

BULLET=In 2002, Caldera changed its name to SCO.

In 2002, Caldera changed its name to SCO.

NOTE:the ancient Unixes were widely distributed

In 2002 and soon after, the ancient Unixes were widely distributed
and reused, under the copyright and license from Caldera.
Some examples are at
http://cvsweb.netbsd.org/bsdweb.cgi/src/external/bsd/pcc/dist/pcc/cc/cc/cc.c?rev=1.1&content-type=text/x-cvsweb-markup
http://cvsweb.netbsd.org/bsdweb.cgi/src/games/ching/ching/ching.sh?rev=1.1&content-type=text/x-cvsweb-markup
http://cvsweb.netbsd.org/bsdweb.cgi/src/usr.bin/spell/spellprog/spellprog.c?rev=1.1&content-type=text/x-cvsweb-markup
http://cvsweb.netbsd.org/bsdweb.cgi/src/usr.bin/deroff/deroff.c?rev=1.1&content-type=text/x-cvsweb-markup

Many, many projects widely share and reuse this historical code, such as
http://heirloom.sourceforge.net/
The various code is mirrored all over the internet.

Note this effectively put copyrights and licenses on unchanged
code that previously had no copyright and license.

BULLET=The new "SCO" attempted to say they owned Unix rights.
BULLET=Defendant's lawyers believed that it was unlikely that anyone
could successfully maintain copyright to the early Unix
versions (based on the 1993 opinion).

The new "SCO" attempted to say they owned Unix rights even though they
had given them away via open source licensing. They tried to challenge
IBM regarding this. IBM's lawyers believed that it was unlikely that
anyone could successfully maintain copyright to the early Unix
versions.

BULLET=In 2003, Novell stated it did not transfer copyrights for Unix
System V to Caldera and communicated it would support the open source
(and Linux) communities, implying it would not challenge use of that
Unix code.
NOTE:Probably because they knew earlier opinions indicated that they
couldn't challenge it.

In 2003, Novell stated it did not transfer copyrights for Unix
System V to Caldera and communicated it would support the open source
(and Linux) communities, implying it would not challenge use of that
Unix code. (Probably because they knew earlier opinions indicated that
they couldn't challenge it.)
https://web.archive.org/web/20030602195439/http://www.novell.com/news/press/archive/2003/05/pr03033.html

BULLET=In 2007, a district court concluded that Novell was the owner of
the Unix copyrights.

In 2007, a district court concluded that Novell was the owner of
the Unix copyrights.

BULLET=In 2009, a district court affirmed again that Novell was the owner
of the Unix copyrights.

In 2009, a district court affirmed again that Novell was the owner of
the Unix copyrights.

BULLET=In 2010, a jury confirmed Novell's ownership of Unix, and again
Novell communicated its protection of the open source community's use of
that Unix code.

In 2010, a jury confirmed Novell's ownership of Unix, and again Novell
communicated its protection of the open source community's use of that
Unix code.

https://www.microfocus.com/about/press-room/article/2010/utah-jury-confirms-novell-has-ownership-of-unix-copyrights/

BULLET=In 2011, a district court again affirmed Novell's copyright
ownership.

In 2011, a district court again affirmed Novell's copyright ownership.

https://www.ca10.uscourts.gov/opinions/10/10-4122.pdf

BULLET=In 2011, Novell was acquired by the Attachmate Group.

In 2011, Novell was acquired by the Attachmate Group.

BULLET=In 2014, Micro Focus acquired the Attachmate Group.

In 2014, Micro Focus acquired the Attachmate Group.

BULLET=Micro Focus's press-room website shares old 2010 news: "The
jury's decision confirmed Novell's ownership of the UNIX copyrights,
which SCO had asserted to own in its attack on Linux."

https://www.microfocus.com/about/press-room/article/2010/utah-jury-confirms-novell-has-ownership-of-unix-copyrights/

When individuals and organizations distribute the 1970's Unix code,
they do so based on the copyright and license of Caldera. But as you
can see, in later years it was stated multiple times that Novell may
really be the owner of that code.

Here is the situation summarized again:

- No copyrights when copyright statements were required.
  (These non-copyrighted files are widely available today.)

- Software was widely shipped and reused.
  (This is easily seen today.)

- The company that owned the rights to Unix couldn't really claim the
  copyrights because they didn't exist. (And that company doesn't
  really exist anymore. And even if it did, it could never close up
  something that it had already given away for free.)

- Effectively, with no copyright and very wide distribution, the files
  are like public domain.

- The commercial Unixes are mostly rewrites or reimplementations of
  some of the historical Unix code. While some of the old code may
  exist there, it is very different. The last commercial Unix systems
  (Solaris, AIX, and HP-UX) are being phased out, and their vendors have
  no interest in the 1970's Unix code.

- Maybe Micro Focus owns the copyrights for the later Unix code.
  The businesses Micro Focus purchased have shown no interest in the
  ancient Unix code in the last 24 years. One of those businesses
  explicitly communicated it would not pursue copyright litigation
  over the historic Unix source code (probably because they couldn't
  prove the old code was copyrighted).

As far as I know, Micro Focus doesn't sell software or licensing
for ancient Unix, though maybe for newer Unix.

https://supportline.microfocus.com/licensing/licensinghome.aspx
https://supportline.microfocus.com/licensing/Unix1.asp
https://supportline.microfocus.com/licensing/unixdeployment.asp?prod=unix
https://community.microfocus.com/t5/Over-the-Back-Fence/Micro-Focus-s-stance-on-Ancient-UNIX-licensing/td-p/1946721

(Note again this code was open sourced because Caldera thought they
bought the ownership from Novell, which now is a part of Micro Focus.)

(A couple years ago) I got in contact with Stirling Adams, Associate
General Counsel, Head of IP at Micro Focus. They will research it. I
doubt they know about it :)

(This was done because of a NetBSD license request regarding some of
this licensed code.)

I'd like to get Micro Focus to provide an additional statement
somewhat similar to what Nokia did (but less restrictive) that it
will not assert any copyright rights on the 1970's Research
Unix Editions. I have a feeling that, if they can understand it, this
may be opening up a can of worms. That is, they won't like that
Caldera put its copyright on it. Basically I'd want Micro Focus just
to acknowledge that it wasn't copyrighted and they won't assert any
rights. (But what are the rules for the EU in regards to making it
public domain?)

On the side, I also asked them about the commercial Unix editions that
they may own from the 1980's. It would be interesting to know their
interests there too. (2 Apr 2018)

By the way, Nokia apparently owns the rights to the non-commercial
"research" versions of Unix from the 1980s. They didn't open source it
fully, but allow non-commercial use.
http://www.tuhs.org/Archive/Distributions/Research/Dan_Cross_v8/statement_regarding_Unix_3-7-17.pdf

From usotsuki at buric.co Sat May 16 18:30:13 2020
From: usotsuki at buric.co (Steve Nickolas)
Date: Sat, 16 May 2020 04:30:13 -0400 (EDT)
Subject: [TUHS] v7 K&R C
In-Reply-To: <20200516003110.GA23382@server.rulingia.com>
References: <202005141841.04EIfvEZ063529@tahoe.cs.Dartmouth.EDU>
 <20200515150122.GF30160@mcvoy.com>
 <014001d62af3$9cc209b0$d6461d10$@ronnatalie.com>
 <20200516003110.GA23382@server.rulingia.com>
Message-ID:

On Sat, 16 May 2020, Peter Jeremy wrote:

> On 2020-May-15 16:56:42 -0400, Steve Nickolas wrote:
>> Isn't it nonstandard (although I am aware of some compilers that do it) to
>> default the type of char to unsigned?
>
> The standard allows "char" to be either signed or unsigned. The ARM ABI
> defines char as unsigned.
>
> I recall that Lattice C on the M68K allowed either signed or unsigned char
> via a flag. Setting it to "unsigned" generally produced faster code on
> my Amiga, though some code assumed signed chars and broke.

Borland did the same.
CC65, I think, defaults to unsigned char, but it's missing some other
features. It is, however, the closest (to my knowledge) that C on the 6502
gets to the ANSI standard.

-uso.

From don at DonHopkins.com Sat May 16 20:20:52 2020
From: don at DonHopkins.com (Don Hopkins)
Date: Sat, 16 May 2020 12:20:52 +0200
Subject: [TUHS] A la carte menu of OO features or properties
In-Reply-To:
References:
Message-ID:

The properties of object oriented programming are better described as an
a la carte menu, like a cafeteria buffet or smörgåsbord, than as a point,
line, or continuum.

http://www.paulgraham.com/reesoo.html

Paul Graham: Rees Re: OO

(Jonathan Rees had a really interesting response to Why Arc isn't
Especially Object-Oriented, which he has allowed me to reproduce here.)

Here is an a la carte menu of features or properties that are related to
these terms; I have heard OO defined to be many different subsets of this
list.

  • Encapsulation - the ability to syntactically hide the implementation
    of a type. E.g. in C or Pascal you always know whether something is a
    struct or an array, but in CLU and Java you can hide the difference.

  • Protection - the inability of the client of a type to detect its
    implementation. This guarantees that a behavior-preserving change to
    an implementation will not break its clients, and also makes sure that
    things like passwords don't leak out.

  • Ad hoc polymorphism - functions and data structures with parameters
    that can take on values of many different types.

  • Parametric polymorphism - functions and data structures that
    parameterize over arbitrary values (e.g. list of anything). ML and
    Lisp both have this. Java doesn't quite because of its non-Object
    types.

  • Everything is an object - all values are objects. True in Smalltalk
    (?) but not in Java (because of int and friends).

  • All you can do is send a message (AYCDISAM) = Actors model - there is
    no direct manipulation of objects, only communication with (or
    invocation of) them. The presence of fields in Java violates this.

  • Specification inheritance = subtyping - there are distinct types known
    to the language with the property that a value of one type is as good
    as a value of another for the purposes of type correctness. (E.g. Java
    interface inheritance.)

  • Implementation inheritance/reuse - having written one pile of code, a
    similar pile (e.g. a superset) can be generated in a controlled
    manner, i.e. the code doesn't have to be copied and edited. A limited
    and peculiar kind of abstraction. (E.g. Java class inheritance.)

  • Sum-of-product-of-function pattern - objects are (in effect)
    restricted to be functions that take as first argument a distinguished
    method key argument that is drawn from a finite set of simple names.

[…]

(See the web page and the original thread for a discussion of which
languages implement which of the above features.)

http://www.paulgraham.com/reesoo.html

https://web.archive.org/web/20160308032317/http://www.eros-os.org/pipermail/e-lang/2001-October/005852.html

-Don

From crossd at gmail.com Sun May 17 02:14:28 2020
From: crossd at gmail.com (Dan Cross)
Date: Sat, 16 May 2020 12:14:28 -0400
Subject: [TUHS] v7 K&R C
In-Reply-To: <0A2C62EF-E43E-43E4-8C53-CE4C99BC5B32@coraid.com>
References: <202005141841.04EIfvEZ063529@tahoe.cs.Dartmouth.EDU>
 <20200515150122.GF30160@mcvoy.com>
 <014001d62af3$9cc209b0$d6461d10$@ronnatalie.com>
 <019701d62af6$050344b0$0f09ce10$@ronnatalie.com>
 <0A2C62EF-E43E-43E4-8C53-CE4C99BC5B32@coraid.com>
Message-ID:

On Fri, May 15, 2020 at 8:58 PM Brantley Coile wrote:

> I always kept local, single characters in ints. This avoided the problem
> with loading a character being signed or unsigned. The reason for not
> specifying is obvious. Today, you can pick the move-byte-into-word
> instruction that either sign extends or doesn't. But when C was defined
> that wasn't the case.
Some machines sign extended when a byte was loaded > into a register and some filled the upper bits with zero. For machines that > filled with zero, a char was unsigned. If you forced the language to do one > or the other, it would be expensive on the opposite kind of machine. > Not only that, but if one used an exactly `char`-width value to hold, er, character data as returned from `getchar` et al, then one would necessarily give up the possibility of handling whatever character value was chosen for the sentinel marking end-of-input stream. `getchar` et al are defined to return EOF on end of input; if they didn't return a wider type than `char`, there would be data that could not be read. On probably every machine I am ever likely to use again in my lifetime, byte value 255 would be -1 as a signed char, but it is also a perfectly valid value for a byte. The details of whether char is signed or unsigned aside, use of a wider type is necessary for correctness and ability to completely represent the input data. It's one of the things that made C a good choice on a wide variety of > machines. > > I guess I always "saw" the return value of the getchar() as being in an int > sized register, at first namely R0, so kept the character values returned > as ints. The actual EOF indication from a read is a return value of zero > for the number of characters read. > That's certainly true. Had C supported multiple return values or some kind of option type from the outset, it might have been that `getchar`, `read`, etc, returned a pair with some useful value (e.g., for `getchar` the value of the byte read; for `read` a length) and some indication of an error/EOF/OK value etc. Notably, both Go and Rust support essentially this: in Go, the `Read` method of `io.Reader` returns an `(int, error)` pair, and the error is `io.EOF` on end-of-input; in Rust, the `read` method of the `Read` trait returns a `Result`, though a `Result::Ok(n)` where `n == 0` indicates EOF.
But I'm just making noise because I'm sure everyone knows all this. > I think it's worthwhile stating these things explicitly, sometimes. - Dan C. > On May 15, 2020, at 4:18 PM, ron at ronnatalie.com wrote: > > > > EOF is defined to be -1. > > getchar() returns int, but c is a unsigned char, the value of (c = > getchar()) will be 255. This will never compare equal to -1. > > > > > > > > Ron, > > > > Hmmm... getchar/getc are defined as returning int in the man page and C > is traditionally defined as an int in this code.. > > > > On Fri, May 15, 2020 at 4:02 PM wrote: > >> Unfortunately, if c is char on a machine with unsigned chars, or it’s > of type unsigned char, the EOF will never be detected. > >> > >> > >> > >>> • while ((c = getchar()) != EOF) if (c == '\n') { /* entire record > is now there */ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From paul.winalski at gmail.com Sun May 17 02:28:01 2020 From: paul.winalski at gmail.com (Paul Winalski) Date: Sat, 16 May 2020 12:28:01 -0400 Subject: [TUHS] v7 K&R C In-Reply-To: References: <202005141841.04EIfvEZ063529@tahoe.cs.Dartmouth.EDU> <20200515150122.GF30160@mcvoy.com> <014001d62af3$9cc209b0$d6461d10$@ronnatalie.com> Message-ID: > On Fri, May 15, 2020 at 4:02 PM wrote: > >Unfortunately, if c is char on a machine with unsigned chars, or it’s of >type unsigned char, the EOF will never be detected. > > - while ((c = getchar()) != EOF) if (c == '\n') { /* entire record is now there */ The function prototype for getchar() is: int getchar(void); It returns an int, not a char. In all likelihood this is specifically *because* EOF is defined as -1. The above code works fine if c is an int. One always has to be very careful when doing a typecast of a function return value. -Paul W. 
From paul.winalski at gmail.com Sun May 17 02:31:06 2020 From: paul.winalski at gmail.com (Paul Winalski) Date: Sat, 16 May 2020 12:31:06 -0400 Subject: [TUHS] v7 K&R C In-Reply-To: References: Message-ID: On 5/15/20, Warner Losh wrote: > > arm has been the biggest outlier in terms of unsigned char. In FreeBSD, > this has been the second largest source of bugs with the platform... the > OABI weird alignment requirements being the first (thankfully behind us)... Why did the implementers of the Unix ABI for ARM decide to have char be unsigned? Was there an architectural reason for it? -Paul W. From imp at bsdimp.com Sun May 17 03:39:54 2020 From: imp at bsdimp.com (Warner Losh) Date: Sat, 16 May 2020 11:39:54 -0600 Subject: [TUHS] v7 K&R C In-Reply-To: References: <202005141841.04EIfvEZ063529@tahoe.cs.Dartmouth.EDU> <20200515150122.GF30160@mcvoy.com> <014001d62af3$9cc209b0$d6461d10$@ronnatalie.com> Message-ID: On Sat, May 16, 2020 at 10:28 AM Paul Winalski wrote: > > On Fri, May 15, 2020 at 4:02 PM wrote: > > > >Unfortunately, if c is char on a machine with unsigned chars, or it’s of > >type unsigned char, the EOF will never be detected. > > > > - while ((c = getchar()) != EOF) if (c == '\n') { /* entire record is > now there */ > > The function prototype for getchar() is: int getchar(void); > > It returns an int, not a char. In all likelihood this is specifically > *because* EOF is defined as -1. The above code works fine if c is an > int. One always has to be very careful when doing a typecast of a > function return value. > In the early days of my involvement with FreeBSD, I went through and fixed about a dozen cases where getopt was being assigned to a char and then compared with EOF. I'm certain that this is why. Also EOF has to be a value that's not representable by a character, or your 0xff bytes would disappear. Warner -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From aap at papnet.eu Sun May 17 04:14:40 2020 From: aap at papnet.eu (Angelo Papenhoff) Date: Sat, 16 May 2020 20:14:40 +0200 Subject: [TUHS] 11/40 CPU (Emulation) test programs still available somewhere In-Reply-To: <15E49B16-E633-47B4-9CFB-42C863559CA4@me.com> References: <15E49B16-E633-47B4-9CFB-42C863559CA4@me.com> Message-ID: <20200516181440.GA9805@indra.papnet.eu> On 15/05/20, Michael Stiller via TUHS wrote: > So the question is, are there programs V6 runnable or rk bootable available which test the CPU > functionality? I already hat a look at some xxdp disk, but the one i found (while they boot) > seem to lack basic cpu tests. I wrote an 11/40 emulator too recently and used the diagnostics to test it (in fact, that's all it has run so far..) This is my testset: https://github.com/aap/pdp11/blob/master/1140.c#L244 Hope this helps, aap
From mstiller at me.com Sun May 17 04:36:38 2020 From: mstiller at me.com (Michael Stiller) Date: Sat, 16 May 2020 20:36:38 +0200 Subject: [TUHS] 11/40 CPU (Emulation) test programs still available somewhere In-Reply-To: <20200516181440.GA9805@indra.papnet.eu> References: <15E49B16-E633-47B4-9CFB-42C863559CA4@me.com> <20200516181440.GA9805@indra.papnet.eu> Message-ID: <75056802-3215-4E03-AEFE-F3681ACC722C@me.com> Hi Angelo, i noticed your project recently and found it very interesting, but did not hat the time to look at it in more detail. Maybe now it’s time to have a more in depth look. Thanks for reminding me. Best regards, Michael > On 16. May 2020, at 20:14, Angelo Papenhoff wrote: > > On 15/05/20, Michael Stiller via TUHS wrote: >> So the question is, are there programs V6 runnable or rk bootable available which test the CPU >> functionality? I already hat a look at some xxdp disk, but the one i found (while they boot) >> seem to lack basic cpu tests. > > I wrote an 11/40 emulator too recently and used the diagnostics to > test it (in fact, that's all it has run so far..) > This is my testset: https://github.com/aap/pdp11/blob/master/1140.c#L244 > > Hope this helps, > aap
From richard at inf.ed.ac.uk Sun May 17 04:45:16 2020 From: richard at inf.ed.ac.uk (Richard Tobin) Date: Sat, 16 May 2020 19:45:16 +0100 (BST) Subject: [TUHS] v7 K&R C In-Reply-To: Paul Winalski's message of Sat, 16 May 2020 12:28:01 -0400 Message-ID: <20200516184516.EBFC22D2F140@macaroni.inf.ed.ac.uk> > The function prototype for getchar() is: int getchar(void); > > It returns an int, not a char. In all likelihood this is specifically > *because* EOF is defined as -1. It would have probably returned int anyway, because of the automatic promotion of char to int in expressions. It was natural to declare functions returning char as int, if you bothered to declare them at all. As K&R1 said: Since char promotes to int in expressions, there is no need to declare functions that return char. Similarly functions that might return short or float would normally return int or double; there aren't separate atof and atod functions for example. -- Richard -- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. From imp at bsdimp.com Sun May 17 06:37:15 2020 From: imp at bsdimp.com (Warner Losh) Date: Sat, 16 May 2020 14:37:15 -0600 Subject: [TUHS] v7 K&R C In-Reply-To: References: Message-ID: On Sat, May 16, 2020 at 2:35 PM Brad Spencer wrote: > Paul Winalski writes: > > > On 5/15/20, Warner Losh wrote: > >> > >> arm has been the biggest outlier in terms of unsigned char.
In FreeBSD, > >> this has been the second largest source of bugs with the platform... the > >> OABI weird alignment requirements being the first (thankfully behind > us)... > > > > Why did the implementers of the Unix ABI for ARM decide to have char > > be unsigned? Was there an architectural reason for it? > > > > -Paul W. > > > My understanding is that it is a lot more efficient to use unsigned char > on arm. You can make gcc, for example, deal with this, but it costs. I > remember having to tell gcc to deal with it when I ported the Doom > engine to a StrongARM processor device under NetBSD many years ago. I > mostly remember the code running well enough, but it was larger. > I've seen numbers that suggest it's about 10% smaller to use unsigned characters, and the code runs 5-10% faster. I've not looked at the generated code to understand why, exactly, that might be the case. Warner -------------- next part -------------- An HTML attachment was scrubbed... URL: From brad at anduin.eldar.org Sun May 17 06:35:24 2020 From: brad at anduin.eldar.org (Brad Spencer) Date: Sat, 16 May 2020 16:35:24 -0400 Subject: [TUHS] v7 K&R C In-Reply-To: (message from Paul Winalski on Sat, 16 May 2020 12:31:06 -0400) Message-ID: Paul Winalski writes: > On 5/15/20, Warner Losh wrote: >> >> arm has been the biggest outlier in terms of unsigned char. In FreeBSD, >> this has been the second largest source of bugs with the platform... the >> OABI weird alignment requirements being the first (thankfully behind us)... > > Why did the implementers of the Unix ABI for ARM decide to have char > be unsigned? Was there an architectural reason for it? > > -Paul W. My understanding is that it is a lot more efficient to use unsigned char on arm. You can make gcc, for example, deal with this, but it costs. I remember having to tell gcc to deal with it when I ported the Doom engine to a StrongARM processor device under NetBSD many years ago. 
I mostly remember the code running well enough, but it was larger. -- Brad Spencer - brad at anduin.eldar.org - KC8VKS - http://anduin.eldar.org From ron at ronnatalie.com Sun May 17 07:55:50 2020 From: ron at ronnatalie.com (Ronald Natalie) Date: Sat, 16 May 2020 17:55:50 -0400 Subject: [TUHS] v7 K&R C In-Reply-To: <20200516184516.EBFC22D2F140@macaroni.inf.ed.ac.uk> References: <20200516184516.EBFC22D2F140@macaroni.inf.ed.ac.uk> Message-ID: <252B4CA1-A67C-4C6E-8AE5-56ED6D4746A0@ronnatalie.com> It would have to be something bigger than char because you need EOF (whatever it could be defined as) to be distinct from any character. > On May 16, 2020, at 2:45 PM, Richard Tobin wrote: > >> The function prototype for getchar() is: int getchar(void); >> >> It returns an int, not a char. In all likelihood this is specifically >> *because* EOF is defined as -1. > > It would have probably returned int anyway, because of the automatic > promotion of char to int in expressions. It was natural to declare > functions returning char as int, if you bothered to declare them at > all. As K&R1 said: > > Since char promotes to int in expressions, there is no need > to declare functions that return char. > > Similarly functions that might return short or float would normally > return int or double; there aren't separate atof and atod functions > for example. > > -- Richard > > -- > The University of Edinburgh is a charitable body, registered in > Scotland, with registration number SC005336. 
> From ron at ronnatalie.com Sun May 17 07:59:45 2020 From: ron at ronnatalie.com (Ronald Natalie) Date: Sat, 16 May 2020 17:59:45 -0400 Subject: [TUHS] v7 K&R C In-Reply-To: <20200515233427.31Vab%steffen@sdaoden.eu> References: <20200515213138.8E0F72D2D71E@macaroni.inf.ed.ac.uk> <077a01d62b08$e696bee0$b3c43ca0$@ronnatalie.com> <20200515233427.31Vab%steffen@sdaoden.eu> Message-ID: <5DB09C5A-F5DA-4375-AAA5-0711FC6FB1D9@ronnatalie.com> The issue is making char play double duty as a basic storage unit and a native character. This means you can never have 16 (or 32 bit) chars on any machine that you wanted to support 8 bit integers. > On May 15, 2020, at 7:34 PM, Steffen Nurpmeso wrote: > > ron at ronnatalie.com wrote in > <077a01d62b08$e696bee0$b3c43ca0$@ronnatalie.com>: > |Char is different. One of the silly foibles of C. char can be signed or > |unsigned at the implementation's decision. > > And i would wish Thompson and Pike would have felt the need to > design UTF-8 ten years earlier. Maybe we would have a halfway > usable "wide" character interface in the standard (C) library. 
> > --steffen > | > |Der Kragenbaer, The moon bear, > |der holt sich munter he cheerfully and one by one > |einen nach dem anderen runter wa.ks himself off > |(By Robert Gernhardt) From steffen at sdaoden.eu Sun May 17 09:26:07 2020 From: steffen at sdaoden.eu (Steffen Nurpmeso) Date: Sun, 17 May 2020 01:26:07 +0200 Subject: [TUHS] v7 K&R C In-Reply-To: <5DB09C5A-F5DA-4375-AAA5-0711FC6FB1D9@ronnatalie.com> References: <20200515213138.8E0F72D2D71E@macaroni.inf.ed.ac.uk> <077a01d62b08$e696bee0$b3c43ca0$@ronnatalie.com> <20200515233427.31Vab%steffen@sdaoden.eu> <5DB09C5A-F5DA-4375-AAA5-0711FC6FB1D9@ronnatalie.com> Message-ID: <20200516232607.nLiIx%steffen@sdaoden.eu> Ronald Natalie wrote in <5DB09C5A-F5DA-4375-AAA5-0711FC6FB1D9 at ronnatalie.com>: |> On May 15, 2020, at 7:34 PM, Steffen Nurpmeso wrote: |> ron at ronnatalie.com wrote in |> <077a01d62b08$e696bee0$b3c43ca0$@ronnatalie.com>: |>|Char is different. One of the silly foibles of C. char can be \ |>|signed or |>|unsigned at the implementation's decision. |> |> And i would wish Thompson and Pike would have felt the need to |> design UTF-8 ten years earlier. Maybe we would have a halfway |> usable "wide" character interface in the standard (C) library. |The issue is making char play double duty as a basic storage unit and \ |a native character. |This means you can never have 16 (or 32 bit) chars on any machine that \ |you wanted to support 8 bit integers. Oh, I am not the person to step in here. [I deleted 60+ lines of char*/void*, and typedefs, etc. experiences i had. And POSIX specifying that a byte has 8-bit. And soon that NULL/(void*)0 has all bits 0.] Unicode / ISO 10646 did not exist by then. sure. I am undecided. I was a real fan of UTF-32 (32-bit character) at times, but when i looked more deeply in Unicode, it turned out to be false thinking: some languages are so complex that you need to address entire sentences, or at least encapsulate "graphem" boundaries, going for "codepoints" is just wrong. 
Then i thought Microsoft and their UTF-16 decision was not that bad, because almost all real life characters of Unicode can nonetheless be addressed by a single 16-bit codepoint, and that eases programming. But moreover UTF-8 needs three bytes for most of them. Why did it happen? Why was the char type overloaded like this? Why was there no byte or "mem" type? It is to this day, i think, that ISO C allows to bypass their (terrible) aliasing rules by casting to and from char*. In v5 usr/src/s2/mail.c i see

  +getfield(buf)
  +char buf[];
  +{
  +	int j;
  +	char c;
  +
  +	j = 0;
  +	while((c = buf[j] = getc(iobuf)) >= 0)
  +	if(c==':' || c=='\n') {
  +		buf[j] =0;
  +		return(1);
  +	} else
  +		j++;
  +	return(0);
  +}

so here the EOF was different and char was signed 7-bit it seems. At that time at latest i have to admit that i have not looked in old source code for years. But just had a quick look in the dmr/ of 5th revision, and there you see "char lowbyte", for example. A nice Sunday from Germany! i wish you, and the list, --steffen | |Der Kragenbaer, The moon bear, |der holt sich munter he cheerfully and one by one |einen nach dem anderen runter wa.ks himself off |(By Robert Gernhardt) From steffen at sdaoden.eu Sun May 17 09:53:08 2020 From: steffen at sdaoden.eu (Steffen Nurpmeso) Date: Sun, 17 May 2020 01:53:08 +0200 Subject: [TUHS] v7 K&R C In-Reply-To: References: <20200511005745.GL17035@mcvoy.com> <357EFE54-BD94-4C10-8C43-C6735BF7D317@via.net> <20200511202555.GU17035@mcvoy.com> <20200514173206.GJ20771@mcvoy.com> Message-ID: <20200516235308.icuQH%steffen@sdaoden.eu> Tony Finch wrote in : |Larry McVoy wrote: |> |> It's got some perl goodness, regexps are part of the syntax, .... | |I got into Unix after perl and I've used it a lot. Back in the 1990s I saw |Henry Spencer's joke that perl was the Swiss Army Chainsaw of Unix, as a |riff on lex being its Swiss Army Knife.
I came to appreciate lex |regrettably late: lex makes it remarkably easy to chew through a huge pile |of text and feed the pieces to some library code written in C. I've been |using re2c recently (http://re2c.org/), which is differently weird than |lex, though it still uses YY in all its variable names. It's remarkable |how much newer lexer/parser generators can't escape from the user |interface of lex/yacc. Another YY example: http://www.hwaci.com/sw/lemon/ P.S.: i really hate automated lexers. I never ever got used to use them. For learning i once tried to use flex/bison, but i failed really hard. I like that blood, sweat and tears thing, and using a lexer seems so shattered, all the pieces. And i find them really hard to read. If you can deal with them they are surely a relief, especially in rapidly moving syntax situations. But if i look at settled source code which uses it, for example usr.sbin/ospfd/parse.y, or usr.sbin/smtpd/parse.y, both of OpenBSD, then i feel lost and am happy that i do not need to maintain that code. --steffen | |Der Kragenbaer, The moon bear, |der holt sich munter he cheerfully and one by one |einen nach dem anderen runter wa.ks himself off |(By Robert Gernhardt) From jon at fourwinds.com Sun May 17 09:59:36 2020 From: jon at fourwinds.com (Jon Steinhart) Date: Sat, 16 May 2020 16:59:36 -0700 Subject: [TUHS] v7 K&R C [really lexers] In-Reply-To: <20200516235308.icuQH%steffen@sdaoden.eu> References: <20200511005745.GL17035@mcvoy.com> <357EFE54-BD94-4C10-8C43-C6735BF7D317@via.net> <20200511202555.GU17035@mcvoy.com> <20200514173206.GJ20771@mcvoy.com> <20200516235308.icuQH%steffen@sdaoden.eu> Message-ID: <202005162359.04GNxalN3783011@darkstar.fourwinds.com> Steffen Nurpmeso writes: > Tony Finch wrote in > : > |Larry McVoy wrote: > |> > |> It's got some perl goodness, regexps are part of the syntax, .... > | > |I got into Unix after perl and I've used it a lot. 
Back in the 1990s I saw > |Henry Spencer's joke that perl was the Swiss Army Chainsaw of Unix, as a > |riff on lex being its Swiss Army Knife. I came to appreciate lex > |regrettably late: lex makes it remarkably easy to chew through a huge pile > |of text and feed the pieces to some library code written in C. I've been > |using re2c recently (http://re2c.org/), which is differently weird than > |lex, though it still uses YY in all its variable names. It's remarkable > |how much newer lexer/parser generators can't escape from the user > |interface of lex/yacc. Another YY example: http://www.hwaci.com/sw/lemon/ > > P.S.: i really hate automated lexers. I never ever got used to > use them. For learning i once tried to use flex/bison, but > i failed really hard. I like that blood, sweat and tears thing, > and using a lexer seems so shattered, all the pieces. And i find > them really hard to read. > > If you can deal with them they are surely a relief, especially in > rapidly moving syntax situations. But if i look at settled source > code which uses it, for example usr.sbin/ospfd/parse.y, or > usr.sbin/smtpd/parse.y, both of OpenBSD, then i feel lost and am > happy that i do not need to maintain that code. > > --steffen Wow, I've had the opposite experience. I find lex/yacc/flex/bison really easy to use. The issue, which I believe was covered in the early docs, is that some languages are not designed with regularity in mind which makes for ugly code. But to be fair, that code is at least as ugly with hand-crafted code. I believe that the original wisecrack was directed towards FORTRAN. My ancient experience was that it was using lex/yacc for HSPICE was not going to work so I had to hand-craft code for that. 
Jon From brantley at coraid.com Sun May 17 10:04:57 2020 From: brantley at coraid.com (Brantley Coile) Date: Sun, 17 May 2020 00:04:57 +0000 Subject: [TUHS] v7 K&R C [really lexers] In-Reply-To: <202005162359.04GNxalN3783011@darkstar.fourwinds.com> References: <20200511005745.GL17035@mcvoy.com> <357EFE54-BD94-4C10-8C43-C6735BF7D317@via.net> <20200511202555.GU17035@mcvoy.com> <20200514173206.GJ20771@mcvoy.com> <20200516235308.icuQH%steffen@sdaoden.eu>, <202005162359.04GNxalN3783011@darkstar.fourwinds.com> Message-ID: “The asteroid to kill this dinosaur is still in orbit.“ —- Plan 9 lex man page I always hand craft my lexers and use yacc to parse. Most code on plan 9 does that as well. Brantley On May 16, 2020, at 8:00 PM, Jon Steinhart wrote: Steffen Nurpmeso writes: Tony Finch wrote in : |Larry McVoy wrote: |> |> It's got some perl goodness, regexps are part of the syntax, .... | |I got into Unix after perl and I've used it a lot. Back in the 1990s I saw |Henry Spencer's joke that perl was the Swiss Army Chainsaw of Unix, as a |riff on lex being its Swiss Army Knife. I came to appreciate lex |regrettably late: lex makes it remarkably easy to chew through a huge pile |of text and feed the pieces to some library code written in C. I've been |using re2c recently (http://re2c.org/), which is differently weird than |lex, though it still uses YY in all its variable names. It's remarkable |how much newer lexer/parser generators can't escape from the user |interface of lex/yacc. Another YY example: http://www.hwaci.com/sw/lemon/ P.S.: i really hate automated lexers. I never ever got used to use them. For learning i once tried to use flex/bison, but i failed really hard. I like that blood, sweat and tears thing, and using a lexer seems so shattered, all the pieces. And i find them really hard to read. If you can deal with them they are surely a relief, especially in rapidly moving syntax situations. 
But if i look at settled source code which uses it, for example usr.sbin/ospfd/parse.y, or usr.sbin/smtpd/parse.y, both of OpenBSD, then i feel lost and am happy that i do not need to maintain that code. --steffen Wow, I've had the opposite experience. I find lex/yacc/flex/bison really easy to use. The issue, which I believe was covered in the early docs, is that some languages are not designed with regularity in mind which makes for ugly code. But to be fair, that code is at least as ugly with hand-crafted code. I believe that the original wisecrack was directed towards FORTRAN. My ancient experience was that it was using lex/yacc for HSPICE was not going to work so I had to hand-craft code for that. Jon -------------- next part -------------- An HTML attachment was scrubbed... URL: From lm at mcvoy.com Sun May 17 10:35:59 2020 From: lm at mcvoy.com (Larry McVoy) Date: Sat, 16 May 2020 17:35:59 -0700 Subject: [TUHS] v7 K&R C In-Reply-To: <20200516235308.icuQH%steffen@sdaoden.eu> References: <357EFE54-BD94-4C10-8C43-C6735BF7D317@via.net> <20200511202555.GU17035@mcvoy.com> <20200514173206.GJ20771@mcvoy.com> <20200516235308.icuQH%steffen@sdaoden.eu> Message-ID: <20200517003559.GC19945@mcvoy.com> On Sun, May 17, 2020 at 01:53:08AM +0200, Steffen Nurpmeso wrote: > Tony Finch wrote in > : > |Larry McVoy wrote: > |> > |> It's got some perl goodness, regexps are part of the syntax, .... > | > |I got into Unix after perl and I've used it a lot. Back in the 1990s I saw > |Henry Spencer's joke that perl was the Swiss Army Chainsaw of Unix, as a > |riff on lex being its Swiss Army Knife. I came to appreciate lex > |regrettably late: lex makes it remarkably easy to chew through a huge pile > |of text and feed the pieces to some library code written in C. I've been > |using re2c recently (http://re2c.org/), which is differently weird than > |lex, though it still uses YY in all its variable names. 
It's remarkable > |how much newer lexer/parser generators can't escape from the user > |interface of lex/yacc. Another YY example: http://www.hwaci.com/sw/lemon/ > > P.S.: i really hate automated lexers. I never ever got used to > use them. For learning i once tried to use flex/bison, but > i failed really hard. I like that blood, sweat and tears thing, > and using a lexer seems so shattered, all the pieces. And i find > them really hard to read. They are not bad if you are good at it. One of my guys has a PhD in compilers and he's good at it. They are not good at performance. BitKeeper has an extensive printf like (sort of, different syntax) language that can be used to customize log output. Rob originally did all that in flex/bison but the performance started to hurt so he rewrote it all: /* * This is a recursive-descent parser that implements the following * grammar for dspecs (where [[...]] indicates an optional clause * and {{...}} indicates 0 or more repetitions of): * * -> {{ }} * -> $if(){}[[$else{}]] * -> $unless(){}[[$else{}]] * -> $each(:ID:){} * -> ${=} * -> * -> {{ }} * -> * -> * -> () * -> ! * -> {{ }} * -> char * -> escaped_char * -> :ID: * -> (:ID:) * -> $ * -> " && " | " || " * -> "=" | "!=" | "=~" * -> " -eq " | " -ne " | " -gt " | " -ge " | " -lt " | " -le " * * This grammar is ambiguous due to (:ID:) loooking like a * parenthesized sub-expression. The code tries to parse (:ID:) first * as an $each variable, then as a regular :ID:, then as regular text. * * Note that this is broken: $if((:MERGE:)){:REV:} * * The following procedures can be thought of as implementing an * attribute grammar where the output parameters are synthesized * attributes which hold the expression values and the next token * of lookahead in some cases. It has been written for speed. * * NOTE: out==0 means evaluate but throw away. * * Written by Rob Netzer with some hacking * by wscott & lm. */ That stuff screams perf wise. 
From imp at bsdimp.com Sun May 17 11:23:04 2020 From: imp at bsdimp.com (Warner Losh) Date: Sat, 16 May 2020 19:23:04 -0600 Subject: [TUHS] v7 K&R C [really lexers] In-Reply-To: References: <20200511005745.GL17035@mcvoy.com> <357EFE54-BD94-4C10-8C43-C6735BF7D317@via.net> <20200511202555.GU17035@mcvoy.com> <20200514173206.GJ20771@mcvoy.com> <20200516235308.icuQH%steffen@sdaoden.eu> <202005162359.04GNxalN3783011@darkstar.fourwinds.com> Message-ID: On Sat, May 16, 2020, 6:05 PM Brantley Coile wrote: > “The asteroid to kill this dinosaur is still in orbit.“ > > —- Plan 9 lex man page > > > I always hand craft my lexers and use yacc to parse. Most code on plan 9 > does that as well. > Wow! That is the most awesome thing I've seen in a while.... Warner Brantley > > > On May 16, 2020, at 8:00 PM, Jon Steinhart wrote: > > Steffen Nurpmeso writes: > > Tony Finch wrote in > > : > > |Larry McVoy wrote: > > |> > > |> It's got some perl goodness, regexps are part of the syntax, .... > > | > > |I got into Unix after perl and I've used it a lot. Back in the 1990s I saw > > |Henry Spencer's joke that perl was the Swiss Army Chainsaw of Unix, as a > > |riff on lex being its Swiss Army Knife. I came to appreciate lex > > |regrettably late: lex makes it remarkably easy to chew through a huge pile > > |of text and feed the pieces to some library code written in C. I've been > > |using re2c recently (http://re2c.org/), which is differently weird than > > |lex, though it still uses YY in all its variable names. It's remarkable > > |how much newer lexer/parser generators can't escape from the user > > |interface of lex/yacc. Another YY example: http://www.hwaci.com/sw/lemon/ > > > P.S.: i really hate automated lexers. I never ever got used to > > use them. For learning i once tried to use flex/bison, but > > i failed really hard. I like that blood, sweat and tears thing, > > and using a lexer seems so shattered, all the pieces. And i find > > them really hard to read. 
> > > If you can deal with them they are surely a relief, especially in > > rapidly moving syntax situations. But if i look at settled source > > code which uses it, for example usr.sbin/ospfd/parse.y, or > > usr.sbin/smtpd/parse.y, both of OpenBSD, then i feel lost and am > > happy that i do not need to maintain that code. > > > --steffen > > > Wow, I've had the opposite experience. I find lex/yacc/flex/bison really > easy to use. The issue, which I believe was covered in the early docs, > is that some languages are not designed with regularity in mind which makes > for ugly code. But to be fair, that code is at least as ugly with > hand-crafted > code. > > I believe that the original wisecrack was directed towards FORTRAN. My > ancient > experience was that it was using lex/yacc for HSPICE was not going to work > so I > had to hand-craft code for that. > > Jon > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From brantley at coraid.com Sun May 17 11:36:16 2020 From: brantley at coraid.com (Brantley Coile) Date: Sun, 17 May 2020 01:36:16 +0000 Subject: [TUHS] v7 K&R C [really lexers] In-Reply-To: References: <20200511005745.GL17035@mcvoy.com> <357EFE54-BD94-4C10-8C43-C6735BF7D317@via.net> <20200511202555.GU17035@mcvoy.com> <20200514173206.GJ20771@mcvoy.com> <20200516235308.icuQH%steffen@sdaoden.eu> <202005162359.04GNxalN3783011@darkstar.fourwinds.com> Message-ID: <9C240181-E905-4DA0-8702-34A9201CA77A@coraid.com> It looks like only grap and pic have mkfiles that invoke lex. > On May 16, 2020, at 9:23 PM, Warner Losh wrote: > > > > On Sat, May 16, 2020, 6:05 PM Brantley Coile wrote: > “The asteroid to kill this dinosaur is still in orbit.“ > —- Plan 9 lex man page > > I always hand craft my lexers and use yacc to parse. Most code on plan 9 does that as well. > > Wow! That is the most awesome thing I've seen in a while.... 
> > Warner > > > Brantley > > >> On May 16, 2020, at 8:00 PM, Jon Steinhart wrote: >> >> Steffen Nurpmeso writes: >>> Tony Finch wrote in >>> : >>> |Larry McVoy wrote: >>> |> >>> |> It's got some perl goodness, regexps are part of the syntax, .... >>> | >>> |I got into Unix after perl and I've used it a lot. Back in the 1990s I saw >>> |Henry Spencer's joke that perl was the Swiss Army Chainsaw of Unix, as a >>> |riff on lex being its Swiss Army Knife. I came to appreciate lex >>> |regrettably late: lex makes it remarkably easy to chew through a huge pile >>> |of text and feed the pieces to some library code written in C. I've been >>> |using re2c recently (http://re2c.org/), which is differently weird than >>> |lex, though it still uses YY in all its variable names. It's remarkable >>> |how much newer lexer/parser generators can't escape from the user >>> |interface of lex/yacc. Another YY example: http://www.hwaci.com/sw/lemon/ >>> >>> P.S.: i really hate automated lexers. I never ever got used to >>> use them. For learning i once tried to use flex/bison, but >>> i failed really hard. I like that blood, sweat and tears thing, >>> and using a lexer seems so shattered, all the pieces. And i find >>> them really hard to read. >>> >>> If you can deal with them they are surely a relief, especially in >>> rapidly moving syntax situations. But if i look at settled source >>> code which uses it, for example usr.sbin/ospfd/parse.y, or >>> usr.sbin/smtpd/parse.y, both of OpenBSD, then i feel lost and am >>> happy that i do not need to maintain that code. >>> >>> --steffen >> >> Wow, I've had the opposite experience. I find lex/yacc/flex/bison really >> easy to use. The issue, which I believe was covered in the early docs, >> is that some languages are not designed with regularity in mind which makes >> for ugly code. But to be fair, that code is at least as ugly with hand-crafted >> code. >> >> I believe that the original wisecrack was directed towards FORTRAN. 
My ancient >> experience was that it was using lex/yacc for HSPICE was not going to work so I >> had to hand-craft code for that. >> >> Jon From beebe at math.utah.edu Sun May 17 12:07:24 2020 From: beebe at math.utah.edu (Nelson H. F. Beebe) Date: Sat, 16 May 2020 20:07:24 -0600 Subject: [TUHS] v7 K&R C [really lexers] Message-ID: Brantley Coile wrote on Sun, 17 May 2020 01:36:16 +0000: >> It looks like only grap and pic have mkfiles that invoke lex. Both of those are Brian Kernighan's work, and from the FIXES file in his nawk, I can offer this quote: >> ... >> Aug 9, 1997: >> somewhat regretfully, replaced the ancient lex-based lexical >> analyzer with one written in C. it's longer, generates less code, >> and more portable; the old one depended too much on mysterious >> properties of lex that were not preserved in other environments. >> in theory these recognize the same language. >> ... ------------------------------------------------------------------------------- - Nelson H. F. Beebe Tel: +1 801 581 5254 - - University of Utah FAX: +1 801 581 4148 - - Department of Mathematics, 110 LCB Internet e-mail: beebe at math.utah.edu - - 155 S 1400 E RM 233 beebe at acm.org beebe at computer.org - - Salt Lake City, UT 84112-0090, USA URL: http://www.math.utah.edu/~beebe/ - ------------------------------------------------------------------------------- From arnold at skeeve.com Sun May 17 20:19:59 2020 From: arnold at skeeve.com (arnold at skeeve.com) Date: Sun, 17 May 2020 04:19:59 -0600 Subject: [TUHS] QED with Unicode support Message-ID: <202005171019.04HAJxHb030339@freefriends.org> Hello All In case this is of interest: > Date: Sat, 16 May 2020 12:18:15 +0100 > Subject: qed-archive (Github) > To: arnold at skeeve.com > > Hi Arnold > > Apologies for an un-solicited email. > > I have ported Rob's UofT QED from your qed-archive/unix-1992 to Linux > (although I suspect anything with a Unix API will work), and have updated > it to work completely in UTF8/Unicode. 
This includes correct handling of > unicode codepoints in regexes, subs, etc. > > It's up on Github under phonologus/QED. I thought you might want to add it > to your list of QED-s in your README, as it may be of interest to the other > three connoisseurs out there...! > > I have also included typeset pdf-s of the tutorial and manpage. The > tutorial is fascinating. > > Also, there is no clear statement about copyright, or license, of the > unix-1992 tarball. I have attributed it to Rob, and I have noted that I > scavenged it from your Github repo, but was wondering if there is any > definitive statement on ownership/authorship that I could include in my > repo? > > Best wishes From dfawcus+lists-tuhs at employees.org Mon May 18 02:10:55 2020 From: dfawcus+lists-tuhs at employees.org (Derek Fawcus) Date: Sun, 17 May 2020 17:10:55 +0100 Subject: [TUHS] v7 K&R C In-Reply-To: <20200515213138.8E0F72D2D71E@macaroni.inf.ed.ac.uk> References: <20200515213138.8E0F72D2D71E@macaroni.inf.ed.ac.uk> Message-ID: <20200517161055.GA5127@clarinet.employees.org> On Fri, May 15, 2020 at 10:31:38PM +0100, Richard Tobin wrote: > "The implementation shall define char to have the same range, > representation, and behavior as either signed char or unsigned char." > - C99 > > (Technically it's a separate type from both of them.) I was about to suggest I'd yet to come across a compiler which handled them that way, but on checking I find that both clang and gcc do now in effect have 3 types. i.e. both 'unsigned char *' and 'signed char *' values passed to a function taking 'char *' raises a warning. I wonder when they started doing that? 
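A small C11 sketch of the three-types point: _Generic may not list two compatible types, so the selection below only compiles because char, signed char and unsigned char are all distinct types. CHAR_KIND and the three helper functions are names invented for this note.

```c
/* Requires C11 for _Generic. */
#define CHAR_KIND(x) _Generic((x),       \
    char:          "char",               \
    signed char:   "signed char",        \
    unsigned char: "unsigned char",      \
    default:       "other")

/* Each helper reports which association its variable selects. */
const char *kind_plain(void)    { char c = 0;          return CHAR_KIND(c); }
const char *kind_signed(void)   { signed char c = 0;   return CHAR_KIND(c); }
const char *kind_unsigned(void) { unsigned char c = 0; return CHAR_KIND(c); }
```

In gcc and clang the pointer warnings described above fall under -Wpointer-sign.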
DF From ron at ronnatalie.com Mon May 18 02:14:48 2020 From: ron at ronnatalie.com (ron at ronnatalie.com) Date: Sun, 17 May 2020 12:14:48 -0400 Subject: [TUHS] v7 K&R C In-Reply-To: <20200517161055.GA5127@clarinet.employees.org> References: <20200515213138.8E0F72D2D71E@macaroni.inf.ed.ac.uk> <20200517161055.GA5127@clarinet.employees.org> Message-ID: <064701d62c66$4b084d90$e118e8b0$@ronnatalie.com> It technically probably always should have. Void* (which has the same format as char*) would have accepted either type pointer, char* shouldn't, though I suspect that early compilers that predate void* would have happily converted any pointer to char* (or int for that matter). -----Original Message----- From: TUHS On Behalf Of Derek Fawcus Sent: Sunday, May 17, 2020 12:11 PM To: tuhs at tuhs.org Subject: Re: [TUHS] v7 K&R C On Fri, May 15, 2020 at 10:31:38PM +0100, Richard Tobin wrote: > "The implementation shall define char to have the same range, > representation, and behavior as either signed char or unsigned char." > - C99 > > (Technically it's a separate type from both of them.) I was about to suggest I'd yet to come across a compiler which handled them that way, but on checking I find that both clang and gcc do now in effect have 3 types. i.e. both 'unsigned char *' and 'signed char *' values passed to a function taking 'char *' raises a warning. I wonder when they started doing that? DF From paul.winalski at gmail.com Mon May 18 02:24:13 2020 From: paul.winalski at gmail.com (Paul Winalski) Date: Sun, 17 May 2020 12:24:13 -0400 Subject: [TUHS] v7 K&R C In-Reply-To: <20200516232607.nLiIx%steffen@sdaoden.eu> References: <20200515213138.8E0F72D2D71E@macaroni.inf.ed.ac.uk> <077a01d62b08$e696bee0$b3c43ca0$@ronnatalie.com> <20200515233427.31Vab%steffen@sdaoden.eu> <5DB09C5A-F5DA-4375-AAA5-0711FC6FB1D9@ronnatalie.com> <20200516232607.nLiIx%steffen@sdaoden.eu> Message-ID: On 5/16/20, Steffen Nurpmeso wrote: > > Why was there no byte or "mem" type? 
These days machine architecture has settled on the 8-bit byte as the unit for addressing, but it wasn't always the case. The PDP-10 addressed memory in 36-bit units. The character manipulating instructions could deal with a variety of different byte lengths: you could store six 6-bit BCD characters per machine word, or five ASCII 7-bit characters (with a bit left over), or four 8-bit characters (ASCII plus parity, with four bits left over), or four 9-bit characters. Regarding a "mem" type, take a look at BLISS. The only data type that language has is the machine word. > +getfield(buf) > +char buf[]; > +{ > + int j; > + char c; > + > + j = 0; > + while((c = buf[j] = getc(iobuf)) >= 0) > + if(c==':' || c=='\n') { > + buf[j] =0; > + return(1); > + } else > + j++; > + return(0); > +} > > so here the EOF was different and char was signed 7-bit it seems. That makes perfect sense if you're dealing with ASCII, which is a 7-bit character set. -Paul W. From ron at ronnatalie.com Mon May 18 02:29:33 2020 From: ron at ronnatalie.com (ron at ronnatalie.com) Date: Sun, 17 May 2020 12:29:33 -0400 Subject: [TUHS] v7 K&R C In-Reply-To: References: <20200515213138.8E0F72D2D71E@macaroni.inf.ed.ac.uk> <077a01d62b08$e696bee0$b3c43ca0$@ronnatalie.com> <20200515233427.31Vab%steffen@sdaoden.eu> <5DB09C5A-F5DA-4375-AAA5-0711FC6FB1D9@ronnatalie.com> <20200516232607.nLiIx%steffen@sdaoden.eu> Message-ID: <065a01d62c68$59b7d890$0d2789b0$@ronnatalie.com> > > so here the EOF was different and char was signed 7-bit it seems. That makes perfect sense if you're dealing with ASCII, which is a 7-bit character set. But that assumes you were reading "characters" rather than "bytes." Binary data certainly could be any combination of 8 bits and you'd want something out of band to signal errors/eof. 
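The out-of-band point can be sketched in C without touching stdio. Here next_byte is a stand-in for getc() that reads from memory and returns 0..255 for data or -1 at the end; struct bytesrc, END_MARK, count_wide and count_narrow are hypothetical names. The narrow version mirrors the old getfield() loop, and assumes a platform where converting 255 to signed char gives -1 (the conversion is implementation-defined, though that result is the common one).

```c
#include <stddef.h>

#define END_MARK (-1)   /* plays the role of EOF */

struct bytesrc {
    const unsigned char *p;   /* next unread byte */
    size_t left;              /* bytes remaining */
};

/* Return the next byte as 0..255, or END_MARK when none are left. */
int next_byte(struct bytesrc *s)
{
    if (s->left == 0)
        return END_MARK;
    s->left--;
    return *s->p++;
}

/* Result kept in an int: END_MARK can never be confused with data. */
long count_wide(struct bytesrc *s)
{
    long n = 0;
    int c;

    while ((c = next_byte(s)) != END_MARK)
        n++;
    return n;
}

/* Result squeezed into a signed char: a 0xFF data byte typically
 * becomes -1 and is mistaken for the end marker. */
long count_narrow(struct bytesrc *s)
{
    long n = 0;
    signed char c;

    while ((c = (signed char)next_byte(s)) != END_MARK)
        n++;
    return n;
}
```

Fed the three bytes 'a', 0xFF, 'b', count_wide sees all three while count_narrow stops at the 0xFF on such a platform. With pure 7-bit ASCII input the two agree, which is why the old code worked.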
From paul.winalski at gmail.com Mon May 18 02:31:28 2020 From: paul.winalski at gmail.com (Paul Winalski) Date: Sun, 17 May 2020 12:31:28 -0400 Subject: [TUHS] v7 K&R C [really lexers] In-Reply-To: <202005162359.04GNxalN3783011@darkstar.fourwinds.com> References: <20200511005745.GL17035@mcvoy.com> <357EFE54-BD94-4C10-8C43-C6735BF7D317@via.net> <20200511202555.GU17035@mcvoy.com> <20200514173206.GJ20771@mcvoy.com> <20200516235308.icuQH%steffen@sdaoden.eu> <202005162359.04GNxalN3783011@darkstar.fourwinds.com> Message-ID: Regarding lex/yacc/flex/bison, I remember (ca. 1980) when DEC's compiler group first got their hands on lex and yacc. For yucks they put the BLISS grammar through yacc. It came back with an error message that the grammar was ambiguous. And it turned out that, yes, Wulf's grammar for BLISS had an obscure corner case that *was* ambiguous. That caused quite a stir. -Paul W. From dfawcus+lists-tuhs at employees.org Mon May 18 02:34:37 2020 From: dfawcus+lists-tuhs at employees.org (Derek Fawcus) Date: Sun, 17 May 2020 17:34:37 +0100 Subject: [TUHS] v7 K&R C In-Reply-To: <20200514172107.GI20771@mcvoy.com> References: <357EFE54-BD94-4C10-8C43-C6735BF7D317@via.net> <20200511202555.GU17035@mcvoy.com> <20200514172107.GI20771@mcvoy.com> Message-ID: <20200517163437.GB5127@clarinet.employees.org> On Thu, May 14, 2020 at 10:21:07AM -0700, Larry McVoy wrote: > On Wed, May 13, 2020 at 08:42:55PM -0400, John P. Linderman wrote: > > I never liked call by reference. When I was trying to understand a chunk of > > code, it was a great mental simplification to know that whatever a called > > routine did, it couldn't have an effect on the code I was trying to > > understand except through a returned value and (ghastly) global variables. That has always been my issue with the C++ references, that one could not read a piece of code in isolation, and know when a reference may be made. 
I guess I'd be happy with references if the syntax always required one to write '&x' when they're being created, then the called function can choose if it either wishes to use a pointer or a reference, the only difference being the syntax used to deref the reference. As to Doug's point about new arithmetic types and overloading, I recall a few years ago reading on the 9fans list about an extension there (in KenC?) which supported them in C. I've not managed to dig up the details again, maybe someone else could. As I recall it involved defining structs. > Call by value is fine for things like a single integer or whatever. When > you have some giant array, you want to pass a pointer. > > And "const" helps a lot with indicating the subroutine isn't going to > change it. However that is simply the ABI, i.e. it should be possible for a sufficiently clever compiler to implement such a call-by-value as a call-by-constish-reference. i.e. this: somefn(struct s p) {...} struct s ss; somefn(ss); in effect becomes syntax sugar for: somefn(constish ref struct s p) {...} struct s ss; somefn(&ss); Where 'constish' does not allow 'ss' to be altered even if somefn() assigns to 'p', because in that case it would do sufficient copying so as to make things work. There was a proposed MIPS ABI which stated this. The current C semantics in effect do that, but with the ABIs always having the caller make a copy and pass a reference to it, rather than allowing the callee to make a copy if/when required. 
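The two calling styles compared above, written out as plain C. The names struct s, sum_by_value and sum_by_ref are illustrative only; the second function does by hand what the "constish" reference would leave to the compiler, copying only because it wants to modify its argument.

```c
struct s {
    int a[4];
};

/* Pass by value: the callee receives its own copy of the struct. */
int sum_by_value(struct s p)
{
    p.a[0] = 0;                     /* changes the copy only */
    return p.a[0] + p.a[1] + p.a[2] + p.a[3];
}

/* Pass a pointer to const: nothing is copied unless the callee
 * decides it needs a private, modifiable version. */
int sum_by_ref(const struct s *p)
{
    struct s tmp = *p;              /* callee-made copy, on demand */

    tmp.a[0] = 0;
    return tmp.a[0] + tmp.a[1] + tmp.a[2] + tmp.a[3];
}
```

In both cases the caller's struct is unchanged after the call; the difference is only in who pays for the copy, and when.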
DF From paul.winalski at gmail.com Mon May 18 02:38:26 2020 From: paul.winalski at gmail.com (Paul Winalski) Date: Sun, 17 May 2020 12:38:26 -0400 Subject: [TUHS] v7 K&R C In-Reply-To: <065a01d62c68$59b7d890$0d2789b0$@ronnatalie.com> References: <20200515213138.8E0F72D2D71E@macaroni.inf.ed.ac.uk> <077a01d62b08$e696bee0$b3c43ca0$@ronnatalie.com> <20200515233427.31Vab%steffen@sdaoden.eu> <5DB09C5A-F5DA-4375-AAA5-0711FC6FB1D9@ronnatalie.com> <20200516232607.nLiIx%steffen@sdaoden.eu> <065a01d62c68$59b7d890$0d2789b0$@ronnatalie.com> Message-ID: On 5/17/20, ron at ronnatalie.com wrote: >> >> so here the EOF was different and char was signed 7-bit it seems. > > That makes perfect sense if you're dealing with ASCII, which is a 7-bit > character set. > > But that assumes you were reading "characters" rather than "bytes." Binary > data certainly could be any combination of 8 bits and you'd want something > out of band to signal errors/eof. Well, the function in question is called getchar(). And although these days "byte" is synonymous with "8 bits", historically it meant "the number of bits needed to store a single character". -Paul W. From thomas.paulsen at firemail.de Mon May 18 04:03:22 2020 From: thomas.paulsen at firemail.de (Thomas Paulsen) Date: Sun, 17 May 2020 20:03:22 +0200 Subject: [TUHS] QED with Unicode support In-Reply-To: <202005171019.04HAJxHb030339@freefriends.org> References: <202005171019.04HAJxHb030339@freefriends.org> Message-ID: <6de5c52595bcbe92fd981d444d9bf450@firemail.de> to make life easier here's the URL: https://github.com/phonologus/QED --- Hello All In case this is of interest: > Date: Sat, 16 May 2020 12:18:15 +0100 > Subject: qed-archive (Github) > To: arnold at skeeve.com > > Hi Arnold > > Apologies for an un-solicited email. > > I have ported Rob's UofT QED from your qed-archive/unix-1992 to Linux > (although I suspect anything with a Unix API will work), and have updated > it to work completely in UTF8/Unicode. 
This includes correct handling of
> unicode codepoints in regexes, subs, etc.
>
> It's up on Github under phonologus/QED. I thought you might want to add it
> to your list of QED-s in your README, as it may be of interest to the other
> three connoisseurs out there...!
>
> I have also included typeset pdf-s of the tutorial and manpage. The
> tutorial is fascinating.
>
> Also, there is no clear statement about copyright, or license, of the
> unix-1992 tarball. I have attributed it to Rob, and I have noted that I
> scavenged it from your Github repo, but was wondering if there is any
> definitive statement on ownership/authorship that I could include in my
> repo?
>
> Best wishes

From clemc at ccc.com Mon May 18 06:08:26 2020
From: clemc at ccc.com (Clem Cole)
Date: Sun, 17 May 2020 16:08:26 -0400
Subject: [TUHS] v7 K&R C
In-Reply-To:
References: <20200515213138.8E0F72D2D71E@macaroni.inf.ed.ac.uk> <077a01d62b08$e696bee0$b3c43ca0$@ronnatalie.com> <20200515233427.31Vab%steffen@sdaoden.eu> <5DB09C5A-F5DA-4375-AAA5-0711FC6FB1D9@ronnatalie.com> <20200516232607.nLiIx%steffen@sdaoden.eu> <065a01d62c68$59b7d890$0d2789b0$@ronnatalie.com>
Message-ID:

On Sun, May 17, 2020 at 12:38 PM Paul Winalski wrote:

> Well, the function in question is called getchar(). And although
> these days "byte" is synonymous with "8 bits", historically it meant
> "the number of bits needed to store a single character".
>

Yep, I think that is the real crux of the issue. If you grew up with systems that used a 5, 6, or even a 7-bit byte, you have an appreciation of the difference. Remember, B, like BCPL and BLISS, only has a 'word' as the storage unit. But by the late 1960s, a byte had been declared (thanks to Fred Brooks shutting down Gene Amdahl's desires) at 8 bits, at least at IBM.** Of course, the issue was that ASCII was using only 7 bits to store a character.
DEC was still sort of transitioning from word-oriented hardware (a lesson, Paul, you and I lived through being forgotten a few years later with Alpha); but the PDP-11, unlike the 18/36 or 12 bit systems, followed IBM's lead and used the 8-bit byte and byte addressing. But that nasty 7-bit ASCII thing messed it up a little bit. When C was created (for the 8-bit byte addressed PDP-11), unlike B, Dennis introduced different types. As he says, "C is quirky", and one of those quirks is that he created a "char" type, which was thus naturally 8 bits for the PDP-11, but was storing 7-bit ASCII data with a bit left over.

As previously said in this discussion, to me the issue is that it was called a *char*, not a *byte*. But I wonder, if Dennis and team had had that foresight, would it in practice have made that much difference? It took many years, many lines of code, and trying to encode the glyphs for many different natural languages to get to ideas like UTF.

As someone else pointed out, one of the other quirks of C was trying to encode the return value of a function into a single 'word.' But like many things in the world, we have to build it first and let it succeed before we can find the real flaws. C was incredibly successful and, as I said before, I'll not trade it for any other language yet, given what it has allowed me and my peers to do over the years. I am humbled by what Dennis did; I doubt many of us would have done as well.

That doesn't make C perfect, or mean that we cannot strive to do better, and maybe time will show Rust or Go to be that. But I suspect that may still be a long time in the future. All my CMU professors in the 1970s said Fortran was dead then. However ..
remember that it still pays my salary and my company makes a ton of money building hardware that runs Fortran codes - it's not even close when you look at number one [check out the application usage on one of the bigger HPC sites in Europe -- I offer it because it's easy to find the data and the graphics make it obvious what is happening: https://www.archer.ac.uk/status/codes/ - other sites have similar stats, but finding them is harder].

Clem

** As my friend Russ Robelen (who was the chief designer of the S/360 Model 50) tells the story, Amdahl was madder than a hornet about it, but Brooks pulled rank and kicked him out of his office. The S/360 was supposed to be an ASCII machine - Amdahl thought the extra bit for a byte was a waste -- Brooks told him if it wasn't a power of 2, don't come back -- that is, "if a byte was not a power of two he did not know how to program for it efficiently and SW being efficient was more important than Amdahl's HW implementation!" (imagine that). Amdahl did get a 24-bit word type, but Brooks made him define it so that 32 bits stored everything, which again Amdahl thought was a waste of HW. Bell would later note that it was the single greatest design choice in the computer industry.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From peter at rulingia.com Mon May 18 18:46:50 2020 From: peter at rulingia.com (Peter Jeremy) Date: Mon, 18 May 2020 18:46:50 +1000 Subject: [TUHS] v7 K&R C In-Reply-To: References: <20200515213138.8E0F72D2D71E@macaroni.inf.ed.ac.uk> <077a01d62b08$e696bee0$b3c43ca0$@ronnatalie.com> <20200515233427.31Vab%steffen@sdaoden.eu> <5DB09C5A-F5DA-4375-AAA5-0711FC6FB1D9@ronnatalie.com> <20200516232607.nLiIx%steffen@sdaoden.eu> <065a01d62c68$59b7d890$0d2789b0$@ronnatalie.com> Message-ID: <20200518084650.GA78465@server.rulingia.com> On 2020-May-17 16:08:26 -0400, Clem Cole wrote: >On Sun, May 17, 2020 at 12:38 PM Paul Winalski >wrote: > >> Well, the function in question is called getchar(). And although >> these days "byte" is synonymous with "8 bits", historically it meant >> "the number of bits needed to store a single character". 8-bit bytes, 32/64-bit "words" and 2's complement arithmetic have been "standard" for so long that I suspect there are a significant number of computing professionals who have never considered that there is any alternative. >Yep, I think that is the real crux of the issue. If you grew up with >systems that used a 5, 6, or even a 7-bit byte; you have an appreciation of >the difference. I've used a 36-bit system that supported 6 or 9-bit bytes. IBM Stretch even supported programmable character sizes. >DEC was still sort of transitioning from word-oriented hardware (a lesson, >Paul, you and I lived through being forgotten a few years later with >Alpha); The Alpha was byte addressed, it just didn't support byte operations on memory (at least originally). That's different to word-oriented machines that only supported word addresses. Supporting byte-wide writes at arbitrary addresses adds a chunk of complexity to the CPU/cache interface and most RISC architectures only supported word load/store operations. -- Peter Jeremy -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From dot at dotat.at Mon May 18 22:04:23 2020 From: dot at dotat.at (Tony Finch) Date: Mon, 18 May 2020 13:04:23 +0100 Subject: [TUHS] v7 K&R C In-Reply-To: References: <20200515213138.8E0F72D2D71E@macaroni.inf.ed.ac.uk> <077a01d62b08$e696bee0$b3c43ca0$@ronnatalie.com> <20200515233427.31Vab%steffen@sdaoden.eu> <5DB09C5A-F5DA-4375-AAA5-0711FC6FB1D9@ronnatalie.com> <20200516232607.nLiIx%steffen@sdaoden.eu> Message-ID: Paul Winalski wrote: > > Regarding a "mem" type, take a look at BLISS. The only data type that > language has is the machine word. BCPL and B were also word-based languages. The PDP-7 was a word-addressed machine. If I understand the history correctly, the move to NB then C was partly to make better use of the byte-addressed PDP11. Tony. -- f.anthony.n.finch http://dotat.at/ a fair, free and open society From dot at dotat.at Mon May 18 22:25:19 2020 From: dot at dotat.at (Tony Finch) Date: Mon, 18 May 2020 13:25:19 +0100 Subject: [TUHS] v7 K&R C In-Reply-To: References: Message-ID: Paul Winalski wrote: > > Why did the implementers of the Unix ABI for ARM decide to have char > be unsigned? Was there an architectural reason for it? The early ARM didn't have a sign-extended byte load instruction. I learned C with the Norcroft ARM C compiler on the Acorn Archimedes in 1991/2ish. Norcroft C had quite a lot of unix flavour despite running on a system that was not at all unixy. (I didn't get my hands on actual unix until a couple of years later.) Acorn had a BSD port to the Archimedes which I've never seen myself - the R260 was a pretty powerful system for its time which I coveted from afar. I believe the 32 bit ARM ABI evolved from the early 26 bit ABI on the Archimedes. (32 bit word, 26 bit address space.) http://chrisacorns.computinghistory.org.uk/RISCiXComputers.html More recent versions of the instruction set have more features. 
I believe the arm64 ABI uses signed char to match what everyone is used to.

I still think unsigned bytes are more sensible, but that's what I was taught at an impressionable age...

Tony.
--
f.anthony.n.finch http://dotat.at/
Trafalgar: North 3 or 4, occasionally 5 later. Moderate, occasionally slight in east. Fair. Good.

From clemc at ccc.com Mon May 18 23:10:33 2020
From: clemc at ccc.com (Clem Cole)
Date: Mon, 18 May 2020 09:10:33 -0400
Subject: [TUHS] v7 K&R C
In-Reply-To:
References: <20200515213138.8E0F72D2D71E@macaroni.inf.ed.ac.uk> <077a01d62b08$e696bee0$b3c43ca0$@ronnatalie.com> <20200515233427.31Vab%steffen@sdaoden.eu> <5DB09C5A-F5DA-4375-AAA5-0711FC6FB1D9@ronnatalie.com> <20200516232607.nLiIx%steffen@sdaoden.eu>
Message-ID:

On Mon, May 18, 2020 at 8:05 AM Tony Finch wrote:

> BCPL and B were also word-based languages.

Yes, that was the style of the systems language. IIRC PL/360 worked the same way too.

> The PDP-7 was a word-addressed machine.

Correct.

> If I understand the history correctly, the move to NB then C was
> partly to make better use of the byte-addressed PDP11.
>

I never used NB, so you'll have to ask someone like Ken or Doug as to when the language became 'different enough' that Dennis felt it was time to rename it. From conversations years ago with dmr, I was under the impression the original additions were considered 'syntactic sugar' at first -- hints to help him generate better code for the PDP-11 (like 'register'). I think Steve was at Waterloo and still using B, and when he returned to MH, C had appeared, but he might be able to shed some light on the transition.

Clearly the byte-addressed behavior of the 11 had a heavy influence on C. As I said in my earlier email, I've sometimes wondered what would have happened to the language if the data units had been: byte, word, ptr only [or if DEC marketing had not screwed up with how BLISS was released - another story for COFF I suspect].
Clem -------------- next part -------------- An HTML attachment was scrubbed... URL: From doug at cs.dartmouth.edu Mon May 18 23:58:26 2020 From: doug at cs.dartmouth.edu (Doug McIlroy) Date: Mon, 18 May 2020 09:58:26 -0400 Subject: [TUHS] v7 K&R C Message-ID: <202005181358.04IDwQ25114938@tahoe.cs.Dartmouth.EDU> > [A]lthough these days "byte" is synonymous with "8 bits", historically it > meant "the number of bits needed to store a single character". It depends upon what you mean by "historically". Originally "byte" was coined to refer to 8 bit addressable units on the IBM 7030 "Stretch" computer. The term was perpetuated for the 360 family of computers. Only later did people begin to attribute the meaning to non-addressable 6- or 9-bit units on 36- and 18-bit machines. Viewed over history, the latter usage was transient and colloquial. Doug From doug at cs.dartmouth.edu Tue May 19 00:33:25 2020 From: doug at cs.dartmouth.edu (Doug McIlroy) Date: Mon, 18 May 2020 10:33:25 -0400 Subject: [TUHS] v7 K&R C Message-ID: <202005181433.04IEXPje124241@tahoe.cs.Dartmouth.EDU> They are flagging non-portable usage. From rdm at cfcl.com Tue May 19 01:13:32 2020 From: rdm at cfcl.com (Rich Morin) Date: Mon, 18 May 2020 08:13:32 -0700 Subject: [TUHS] v7 K&R C In-Reply-To: References: <20200515213138.8E0F72D2D71E@macaroni.inf.ed.ac.uk> <077a01d62b08$e696bee0$b3c43ca0$@ronnatalie.com> <20200515233427.31Vab%steffen@sdaoden.eu> <5DB09C5A-F5DA-4375-AAA5-0711FC6FB1D9@ronnatalie.com> <20200516232607.nLiIx%steffen@sdaoden.eu> Message-ID: <198EF603-9E5F-4603-8B6E-C2019423715E@cfcl.com> > On May 17, 2020, at 09:24, Paul Winalski wrote: > > ... The PDP-10 addressed memory in 36-bit units. The character manipulating > instructions could deal with a variety of different byte lengths: you could > store ... five ASCII 7-bit characters (with a bit left over) ... IIRC, this format was called 5/7 IOPS ASCII. 
The PDP-7, 9, and 15 computers used a variant of this format, but they had to start with a pair of (18-bit) words. Around 1970, I wrote a pair of (assembly language) routines to extract and insert characters, because our PDP-15 did NOT have character manipulating instructions. -r From brantley at coraid.com Tue May 19 01:51:42 2020 From: brantley at coraid.com (Brantley Coile) Date: Mon, 18 May 2020 15:51:42 +0000 Subject: [TUHS] v7 K&R C In-Reply-To: <198EF603-9E5F-4603-8B6E-C2019423715E@cfcl.com> References: <20200515213138.8E0F72D2D71E@macaroni.inf.ed.ac.uk> <077a01d62b08$e696bee0$b3c43ca0$@ronnatalie.com> <20200515233427.31Vab%steffen@sdaoden.eu> <5DB09C5A-F5DA-4375-AAA5-0711FC6FB1D9@ronnatalie.com> <20200516232607.nLiIx%steffen@sdaoden.eu> <198EF603-9E5F-4603-8B6E-C2019423715E@cfcl.com> Message-ID: <5C93338C-5CA1-48D5-8F99-39AE17C45EFF@coraid.com> CDC NOS on the 6600 (I used a 70/74, actually) used 12 bits to store ASCII. Mostly, we used six bit display code. Since there was a printable character for every code, you could just look at the binary files. Ten characters to the word! Brantley > On May 18, 2020, at 11:13 AM, Rich Morin wrote: > >> On May 17, 2020, at 09:24, Paul Winalski wrote: >> >> ... The PDP-10 addressed memory in 36-bit units. The character manipulating >> instructions could deal with a variety of different byte lengths: you could >> store ... five ASCII 7-bit characters (with a bit left over) ... > > IIRC, this format was called 5/7 IOPS ASCII. The PDP-7, 9, and 15 computers used > a variant of this format, but they had to start with a pair of (18-bit) words. > Around 1970, I wrote a pair of (assembly language) routines to extract and insert > characters, because our PDP-15 did NOT have character manipulating instructions. 
> > -r > From crossd at gmail.com Tue May 19 02:11:00 2020 From: crossd at gmail.com (Dan Cross) Date: Mon, 18 May 2020 12:11:00 -0400 Subject: [TUHS] v7 K&R C In-Reply-To: References: <20200515213138.8E0F72D2D71E@macaroni.inf.ed.ac.uk> <077a01d62b08$e696bee0$b3c43ca0$@ronnatalie.com> <20200515233427.31Vab%steffen@sdaoden.eu> <5DB09C5A-F5DA-4375-AAA5-0711FC6FB1D9@ronnatalie.com> <20200516232607.nLiIx%steffen@sdaoden.eu> Message-ID: On Sun, May 17, 2020 at 12:24 PM Paul Winalski wrote: > On 5/16/20, Steffen Nurpmeso wrote: > > > > Why was there no byte or "mem" type? > > These days machine architecture has settled on the 8-bit byte as the > unit for addressing, but it wasn't always the case. The PDP-10 > addressed memory in 36-bit units. The character manipulating > instructions could deal with a variety of different byte lengths: you > could store six 6-bit BCD characters per machine word, Was this perhaps a typo for 9 4-bit BCD digits? I have heard that a reason for the 36-bit word size of computers of that era was that the main competition at the time was against mechanical calculator, which had 9-digit precision. 9*4=36, so 9 BCD digits could fit into a single word, for parity with the competition. 6x6-bit data would certainly hold BAUDOT data, and I thought the Univac/CDC machines supported a 6-bit character set? Does this live on in the Unisys 1100-series machines? I see some reference to FIELDATA online. I feel like this might be drifting into COFF territory now; Cc'ing there. or five ASCII > 7-bit characters (with a bit left over), or four 8-bit characters > (ASCII plus parity, with four bits left over), or four 9-bit > characters. > > Regarding a "mem" type, take a look at BLISS. The only data type that > language has is the machine word. 
>
> > +getfield(buf)
> > +char buf[];
> > +{
> > +	int j;
> > +	char c;
> > +
> > +	j = 0;
> > +	while((c = buf[j] = getc(iobuf)) >= 0)
> > +		if(c==':' || c=='\n') {
> > +			buf[j] =0;
> > +			return(1);
> > +		} else
> > +			j++;
> > +	return(0);
> > +}
> >
> > so here the EOF was different and char was signed 7-bit it seems.
>
> That makes perfect sense if you're dealing with ASCII, which is a
> 7-bit character set.

To bring it back slightly to Unix, when Mary Ann and I were playing around with First Edition on the emulated PDP-7 at LCM+L during the Unix50 event last USENIX, I have a vague recollection that the B routine for reading a character from stdin was either `getchar` or `getc`. I had some impression that this did some magic necessary to extract a character from half of an 18-bit word (maybe it just zeroed the upper half of a word or something).

If I had to guess, I imagine that the coincidence between "character" and "byte" in C is a quirk of this history, as opposed to any special hidden meaning regarding textual vs binary data, particularly since Unix makes no real distinction between the two: files are just unstructured bags of bytes, they're called 'char' because that was just the way things had always been.

- Dan C.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From pnr at planet.nl  Tue May 19 07:01:52 2020
From: pnr at planet.nl (Paul Ruizendaal)
Date: Mon, 18 May 2020 23:01:52 +0200
Subject: [TUHS] Chaos networking in 8th edition (and in 7th)
Message-ID:

The surviving 8th edition source has code for Chaos networking included: https://www.tuhs.org/cgi-bin/utree.pl?file=V8/usr/sys

It does not appear to be included in the man pages. Was Chaos networking in use at the labs, or is it just an artifact present on the surviving tape?

Related to that, I’m interested in the Chaosnet implementation for 7th edition.
Dave Moon's Chaosnet memo includes this intriguing sentence when talking about the V7 implementation: “The NCP is entirely implemented in the kernel as a device driver”. I could not find that source code in the TUHS archive, nor on Kirk McKusick’s DVD. Does anybody happen to have it?
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ron at ronnatalie.com  Tue May 19 07:18:34 2020
From: ron at ronnatalie.com (ron at ronnatalie.com)
Date: Mon, 18 May 2020 17:18:34 -0400
Subject: [TUHS] v7 K&R C
In-Reply-To:
References: <20200515213138.8E0F72D2D71E@macaroni.inf.ed.ac.uk> <077a01d62b08$e696bee0$b3c43ca0$@ronnatalie.com> <20200515233427.31Vab%steffen@sdaoden.eu> <5DB09C5A-F5DA-4375-AAA5-0711FC6FB1D9@ronnatalie.com> <20200516232607.nLiIx%steffen@sdaoden.eu>
Message-ID: <094d01d62d59$e4192240$ac4b66c0$@ronnatalie.com>

No typo. While BCD was a way of encoding digits, BCD was also used as a character encoding. Often these were outgrowths of the digit+zone punch encoding of IBM cards. IBM later extended their BCD, making it into… the EXTENDED Binary Coded Decimal Interchange Code, going from 6 to 8 bits in the process.

UNIVAC indeed had their own BCD-ish format called FIELDATA. It was notable in that the nul value printed as @.

The PDP-10 and the UNIVAC 1100 series were perhaps just the longest surviving of the 36-bit computers, which also included the IBM 70XX series and the GE 600 (Honeywell 6000) series. Both the UNIVAC and the PDP-10 did have the nice variable partial word mode, but all of these were indeed word-addressed machines.

The early Crays also were word addressed. The C compiler would simulate byte addressing by putting the byte offset into the word in the high-order bits (the address registers themselves were pinned at 24 bits).

Just to get this back on a UNIX history track, let me delve into more trivia. Perhaps the real oddity was the Denelcor HEP.
The HEP had two addressing modes: one was byte addressed (as you expect), the other was for all other data types (16-bit, 32-bit, and 64-bit portions of the 64-bit word). The lower 3 bits of the memory address encoded the word size. If it was 0 or 4, then it was a 64-bit operand at the address specified in the higher part of the pointer. If it was 2 or 6, then it was either the upper or lower half word. If it was 1, 3, 5, or 7, it was one of the respective quarter words.

This caused a problem when we ported 4BSD to the thing. The Berkeley kernel (particularly in the I/O code) did what I called “conversion by union.” They would store a value in a union using one type pointer and then retrieve it via a different type. In our compiler, had they used a cast (or gone via a void* intermediary), everything would have been fine. But doing this sort of shenanigan (which is technically undefined behavior in C) led to insidious bugs where you’d be doing pointer operations and the WRONG size word would be referenced. I spent a few days hunting all these down and fixing them.

It was about this time I realized that the code was setting up I/Os using a feature aptly named “The Low Speed Bus” and that we’d never get any reasonable performance that way. HEP designer Burton Smith and I redesigned the I/O system literally on napkins from the Golden Corral in Aberdeen. We went back and built a new I/O system out of spare parts we had on hand, using an 11/34 as a control processor. The HEP I/O system was kind of interesting in that while it had a high-speed interface into the HEP’s ECL memory, the thing consisted of 32 individual DEC UNIBUSes.
-------------- next part --------------
An HTML attachment was scrubbed...
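A toy model of the pointer tagging and the "conversion by union" problem described in the message above. It is only a sketch built from that description: the type hep_ptr and the functions hep_operand_bits, hep_retag and hep_via_union are hypothetical, and which tag value selects which half or quarter word is an assumption, since the message does not say.

```c
/* Toy model only; this is not HEP compiler or hardware code. */

typedef unsigned long hep_ptr;   /* hypothetical: a word pointer as a plain integer */

/* Operand width implied by the low three bits of the pointer,
 * following the description in the message. */
int hep_operand_bits(hep_ptr p)
{
    switch (p & 7) {
    case 0: case 4: return 64;   /* full word */
    case 2: case 6: return 32;   /* a half word */
    default:        return 16;   /* 1, 3, 5, 7: a quarter word */
    }
}

/* What a cast could do: keep the word address, rewrite the size tag.
 * Assumption: the lowest tag value of each width is chosen. */
hep_ptr hep_retag(hep_ptr p, int bits)
{
    hep_ptr word = p & ~(hep_ptr)7;
    if (bits == 64)
        return word;
    if (bits == 32)
        return word | 2;
    return word | 1;
}

/* "Conversion by union": the bits are stored as one pointer type and
 * read back as another, unchanged, so the old size tag survives. */
hep_ptr hep_via_union(hep_ptr p)
{
    union { hep_ptr as_long_ptr; hep_ptr as_short_ptr; } u;
    u.as_long_ptr = p;
    return u.as_short_ptr;
}
```

In this model a pointer to a 64-bit operand that goes through hep_via_union still claims a 64-bit operand, while the same pointer passed through hep_retag(p, 16) refers to a quarter word at the same word address. That difference is the wrong-size reference the message describes.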
URL: From norman at oclsc.org Tue May 19 07:52:38 2020 From: norman at oclsc.org (Norman Wilson) Date: Mon, 18 May 2020 17:52:38 -0400 Subject: [TUHS] Chaos networking in 8th edition (and in 7th) Message-ID: <1589838762.16960.for-standards-violators@oclsc.org> Paul Ruizendaal: Was Chaos networking in use at the labs, or is it just an artifact present on the surviving tape? === I don't recall any use of Chaos in 1127. Possibly one of the nearby groups who also used the Research system needed it at some point, perhaps before my time (I arrived in August 1984). I certainly don't remember anyone raising objections to discarding it. Norman Wilson Toronto ON From doug at cs.dartmouth.edu Tue May 19 12:29:34 2020 From: doug at cs.dartmouth.edu (Doug McIlroy) Date: Mon, 18 May 2020 22:29:34 -0400 Subject: [TUHS] v7 K&R C Message-ID: <202005190229.04J2TYgL125193@tahoe.cs.Dartmouth.EDU> I should have checked my 7030 manual before asserting that the 8-bit byte came from there. The term did, but it meant an addressable unit of 1 to 8 bits depending on the instruction being executed. [The machine was addressable to the bit. It also had all 16 bitwise logical operators, and maintained counts of the 1 bits and leading 0 bits in a register. And it was BIG. I saw one with 17 memory boxes (each essentially identical with the total memory of a 7090) stretched across the immaculate hardwood floor of IBM's Poughkeepsie plant.] Doug From usotsuki at buric.co Tue May 19 13:20:34 2020 From: usotsuki at buric.co (Steve Nickolas) Date: Mon, 18 May 2020 23:20:34 -0400 (EDT) Subject: [TUHS] v7 K&R C In-Reply-To: <202005190229.04J2TYgL125193@tahoe.cs.Dartmouth.EDU> References: <202005190229.04J2TYgL125193@tahoe.cs.Dartmouth.EDU> Message-ID: On Mon, 18 May 2020, Doug McIlroy wrote: > I should have checked my 7030 manual before asserting > that the 8-bit byte came from there. The term did, > but it meant an addressable unit of 1 to 8 bits > depending on the instruction being executed. 
> > [The machine was addressable to the bit. It also > had all 16 bitwise logical operators, and > maintained counts of the 1 bits and leading > 0 bits in a register. And it was BIG. I saw > one with 17 memory boxes (each essentially > identical with the total memory of a 7090) > stretched across the immaculate hardwood > floor of IBM's Poughkeepsie plant.] > > Doug > I used to go by that plant all the time, because I lived in Staatsburg two towns north of Poughkeepsie for 3 years. -uso. From grog at lemis.com Tue May 19 13:24:09 2020 From: grog at lemis.com (Greg 'groggy' Lehey) Date: Tue, 19 May 2020 13:24:09 +1000 Subject: [TUHS] IBM 7030 byte size (was: v7 K&R C) In-Reply-To: <202005181358.04IDwQ25114938@tahoe.cs.Dartmouth.EDU> References: <202005181358.04IDwQ25114938@tahoe.cs.Dartmouth.EDU> Message-ID: <20200519032409.GJ1670@eureka.lemis.com> On Monday, 18 May 2020 at 9:58:26 -0400, Doug McIlroy wrote: >> [A]lthough these days "byte" is synonymous with "8 bits", historically it >> meant "the number of bits needed to store a single character". > > It depends upon what you mean by "historically". Originally "byte" > was coined to refer to 8 bit addressable units on the IBM 7030 > "Stretch" computer. It seems that even then it was of variable size. From G.R. Trimble, "STRETCH," Computer Usage Communique, 1963, (http://archive.computerhistory.org/resources/text/Computer_Usage_Company/cuc.communique_vol2no3.1963.102651922.pdf): the words can be composed of "bytes" with from one to eight bits in a byte. There's more at https://people.cs.clemson.edu/~mark/stretch.html. > The term was perpetuated for the 360 family of computers. Only > later did people begin to attribute the meaning to non-addressable > 6- or 9-bit units on 36- and 18-bit machines. > > Viewed over history, the latter usage was transient and colloquial Transient maybe, but UNIVAC used the term in its documentation of the 1100 series. 
The 1106/1108/1110 could access (but not directly address) 6, 9 and 12 bit "bytes". Greg -- Sent from my desktop computer. Finger grog at lemis.com for PGP public key. See complete headers for address and phone numbers. This message is digitally signed. If your Microsoft mail program reports problems, please read http://lemis.com/broken-MUA -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 163 bytes Desc: not available URL: From lars at nocrew.org Tue May 19 13:41:29 2020 From: lars at nocrew.org (Lars Brinkhoff) Date: Tue, 19 May 2020 03:41:29 +0000 Subject: [TUHS] Chaos networking in 8th edition (and in 7th) In-Reply-To: <1589838762.16960.for-standards-violators@oclsc.org> (Norman Wilson's message of "Mon, 18 May 2020 17:52:38 -0400") References: <1589838762.16960.for-standards-violators@oclsc.org> Message-ID: <7wimgslhg6.fsf@junk.nocrew.org> Norman Wilson wrote: > Paul Ruizendaal wrote: >> Was Chaos networking in use at the labs, or is it just an artifact >> present on the surviving tape? > > I don't recall any use of Chaos in 1127. Possibly one of > the nearby groups who also used the Research system needed > it at some point Speculating wildly, maybe there was a Lisp machine somewhere? From imp at bsdimp.com Tue May 19 15:28:35 2020 From: imp at bsdimp.com (Warner Losh) Date: Mon, 18 May 2020 23:28:35 -0600 Subject: [TUHS] Status of Space Travel Message-ID: What's the current state of efforts to get space travel working again? Warner -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From lars at nocrew.org  Tue May 19 15:56:25 2020
From: lars at nocrew.org (Lars Brinkhoff)
Date: Tue, 19 May 2020 05:56:25 +0000
Subject: [TUHS] Status of Space Travel
In-Reply-To: (Warner Losh's message of "Mon, 18 May 2020 23:28:35 -0600")
References:
Message-ID: <7wa724lb7a.fsf@junk.nocrew.org>

Warner Losh wrote:
> What's the current state of efforts to get space travel working again?

Somewhat stalled. I wrote an emulator for the Graphic II and it seems to work with the pool game and tic-tac-toe. Space Travel has some problem running and will not display anything.

From sebras at gmail.com  Tue May 19 15:57:41 2020
From: sebras at gmail.com (Sebastian Rasmussen)
Date: Tue, 19 May 2020 13:57:41 +0800
Subject: [TUHS] Status of Space Travel
In-Reply-To:
References:
Message-ID:

> What's the current state of efforts to get space travel working again?

The assembler source listing has been transcribed two times to weed out any transcription errors. The same goes for the accompanying floating point library.

It currently starts and draws a screen for a brief second in the emulator before crashing back to the prompt, so something is amiss.

/ Sebastian

From sebras at gmail.com  Tue May 19 16:34:57 2020
From: sebras at gmail.com (Sebastian Rasmussen)
Date: Tue, 19 May 2020 14:34:57 +0800
Subject: [TUHS] Status of Space Travel
In-Reply-To:
References:
Message-ID:

> It currently starts and draws a screen for a brief second in the emulator before
> crashing back to the prompt, so something is amiss.

This is a screenshot of what I see on the emulated GRAPHIC-II screen for a brief second.
It is slightly dark due to the emulated phosphor decay: http://sebras.se/space-travel.png / Sebastian From dave at horsfall.org Tue May 19 17:41:31 2020 From: dave at horsfall.org (Dave Horsfall) Date: Tue, 19 May 2020 17:41:31 +1000 (EST) Subject: [TUHS] v7 K&R C In-Reply-To: <20200518084650.GA78465@server.rulingia.com> References: <20200515213138.8E0F72D2D71E@macaroni.inf.ed.ac.uk> <077a01d62b08$e696bee0$b3c43ca0$@ronnatalie.com> <20200515233427.31Vab%steffen@sdaoden.eu> <5DB09C5A-F5DA-4375-AAA5-0711FC6FB1D9@ronnatalie.com> <20200516232607.nLiIx%steffen@sdaoden.eu> <065a01d62c68$59b7d890$0d2789b0$@ronnatalie.com> <20200518084650.GA78465@server.rulingia.com> Message-ID: On Mon, 18 May 2020, Peter Jeremy wrote: > 8-bit bytes, 32/64-bit "words" and 2's complement arithmetic have been > "standard" for so long that I suspect there are a significant number of > computing professionals who have never considered that there is any > alternative. You haven't lived until you've dealt with a 1's-complement machine i.e. -0 != 0 ... To be fair, it was *mostly* normalised. >> Yep, I think that is the real crux of the issue. If you grew up with >> systems that used a 5, 6, or even a 7-bit byte; you have an >> appreciation of the difference. > > I've used a 36-bit system that supported 6 or 9-bit bytes. IBM Stretch > even supported programmable character sizes. Ever tried a Univac or a Honeywell? I don't remember the exact details, and I prefer to keep it that way... > The Alpha was byte addressed, it just didn't support byte operations on > memory (at least originally). That's different to word-oriented > machines that only supported word addresses. Supporting byte-wide > writes at arbitrary addresses adds a chunk of complexity to the > CPU/cache interface and most RISC architectures only supported word > load/store operations. I had to support an old Alpha once; that was one of the reasons why I was happy to leave the joint. 
We had just one customer who used an Alpha, and thus we/I had to maintain the thing. And don't even ask me about HP-UX (just as well that they weren't called Packard-Hewlett), nor Xenix, nor early Slowaris, nor National Cash Registers, nor... Excuse me, I now have to take my sleepy pills :-) -- Dave From jnc at mercury.lcs.mit.edu Tue May 19 22:29:39 2020 From: jnc at mercury.lcs.mit.edu (Noel Chiappa) Date: Tue, 19 May 2020 08:29:39 -0400 (EDT) Subject: [TUHS] v7 K&R C Message-ID: <20200519122939.07AE618C086@mercury.lcs.mit.edu> There was a recent message I now can't find that I wanted to reply to, something about which type to use to get a certain effect. I wanted to reply to say that I felt that it was not really the best way to go, to have one set of type names that tried to denote both i) the semantics of the data, and ii) the size of the item, using arbitrary names. This came up for me when we started to try and write portable networking code. There, you need to be able to specify very precisely how long fields are (well, in lower-level protocols, which use non-printable formats). How to do that in a way that was portable, in the compilers of the day, was a real struggle. (It might be doable now, but I think the fixes that allowed it were still just patches to something that had gone in the wrong direction, above.) I created a series of macros for type definitions, ones that separately and explicitly specified the semantics and size. They looked like 'xxxy', where 'xxx' was the semantics (signed and unsigned integers, bit field, etc), and 'y' was a length indication (byte, short, long, and others). So you'd see things like 'unsb' and 'intl'. The interesting twist was a couple of unusual length specifiers; among them, 'w' stood for 'the machine's natural word length', and 'f' meant 'no particular length, just whatever's fastest on this architecture/compiler, and at least 16 bits'. 
The former was useful in OSy type code; the latter for locals and things where nobody outside the machine would see them.

Then you'd have to have a file of macro definitions (only one per machine) which translated them all into the local architecture/compiler - some didn't go, of course (no 'unsb' on a PDP-11), but it all worked really, really well, for many years.

E.g. at one point, as a dare/hack, I said I'd move the MOS operating system, a version written in portable C (with that type name system) to the AMD 29000 over one night. This wasn't totally crazy; I'd already gotten the debugger (a DDT written in similar portable C) to run on the machine, so I knew where the potholes were. I'd have to write a small amount of machine language (which I could translate from the M68K version), but most of it should just compile and go. I didn't quite make it; it wasn't quite running when people started coming in the next morning, but IIRC it started to work later that day.

	Noel

From dave at horsfall.org  Tue May 19 23:12:02 2020
From: dave at horsfall.org (Dave Horsfall)
Date: Tue, 19 May 2020 23:12:02 +1000 (EST)
Subject: [TUHS] Status of Space Travel
In-Reply-To:
References:
Message-ID:

On Mon, 18 May 2020, Warner Losh wrote:

> What's the current state of efforts to get space travel working again?

Does Lunar Lander on ye olde GT-40 count? I've since lost the source, but I'm hoping that Andrew Hume still has it somewhere.

-- Dave

From lars at nocrew.org  Tue May 19 23:58:22 2020
From: lars at nocrew.org (Lars Brinkhoff)
Date: Tue, 19 May 2020 13:58:22 +0000
Subject: [TUHS] Status of Space Travel
In-Reply-To: (Dave Horsfall's message of "Tue, 19 May 2020 23:12:02 +1000 (EST)")
References:
Message-ID: <7wo8qkjabl.fsf@junk.nocrew.org>

Dave Horsfall wrote:
>> What's the current state of efforts to get space travel working again?
> Does Lunar Lander on ye olde GT-40 count? I've since lost the source,
> but I'm hoping that Andrew Hume still has it somewhere.

In the context of reviving Space Travel, I don't think it counts.

Source code here: http://www.brouhaha.com/~eric/retrocomputing/dec/gt40/software/moonlander/gtlem.mac

From clemc at ccc.com  Tue May 19 23:59:39 2020
From: clemc at ccc.com (Clem Cole)
Date: Tue, 19 May 2020 09:59:39 -0400
Subject: [TUHS] Status of Space Travel
In-Reply-To:
References:
Message-ID:

On Tue, May 19, 2020 at 9:13 AM Dave Horsfall wrote:

> On Mon, 18 May 2020, Warner Losh wrote:
>
> > What's the current state of efforts to get space travel working again?
>
> Does Lunar Lander on ye olde GT-40 count? I've since lost the source, but
> I'm hoping that Andrew Hume still has it somewhere.
>

https://en.wikipedia.org/wiki/Lunar_Lander_(video_game_genre)

Please send me email offline if you want the sources from Jack. BTW: his 16-bit integer trig routines are still a tour-de-force of integer operations. Looking at the assembler sources to Moonlander just to learn how he did that is worthwhile. Jack said he spent a day in the MIT library working it all out, went home that night, and coded it all up.

FYI: I'm also still in almost daily contact with Jack, who is still hacking on graphics systems (and his joke mailing list is a who's who of the industry). [I've tried for years to get him to join this list, but I have never been successful.]
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jacob.ritorto at gmail.com  Wed May 20 04:31:02 2020
From: jacob.ritorto at gmail.com (Jacob Ritorto)
Date: Tue, 19 May 2020 14:31:02 -0400
Subject: [TUHS] 2.11bsd boot countdown stuck at <5> on 11/73
Message-ID:

Hi,
I had a disk in my 11/83 that I'd set up for autoboot. The pdp11 would load block 0, load /boot and happily count down 5 seconds, then continue on with kernel loading & regular bringing up of the system to multiuser. That was cool!

So I pulled that disk out and plugged it into my other system with a dual height 11/73 (this system has the same CMD CQD-200 scsi controller) and it gets to the countdown part and stops! I'm wondering why. The initial bootblock load is apparently fine and loading and running of boot is happening; it's just that the countdown stays stuck at five and I have to hit to get it to boot (and the rest of the boot sequence is totally normal).

I had a suspicion that maybe the clock isn't working on the 11/73. Would that make sense? Maybe boot is actually using a clock somehow? Maybe I have to check jumpers on the 11/73 to be sure its clock is enabled?

Just thought I'd float the question here in case someone remembers this oddity before I start digging / "use the source."

thx
jake
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jnc at mercury.lcs.mit.edu  Wed May 20 04:50:59 2020
From: jnc at mercury.lcs.mit.edu (Noel Chiappa)
Date: Tue, 19 May 2020 14:50:59 -0400 (EDT)
Subject: [TUHS] 2.11bsd boot countdown stuck at <5> on 11/73
Message-ID: <20200519185059.05A1518C085@mercury.lcs.mit.edu>

    > From: Jacob Ritorto

    > I had a suspicion that maybe the clock isn't working on the 11/73.
    > Would that make sense?

Would definitely explain the symptoms.

    > Maybe I have to check jumpers on the 11/73 to be sure its clock is
    > enabled?

There is a jumper on the KDJ11-A that enables/disables the clock register; W9 (closest to the handles) should be removed to enable the clock.

(I really need to add the jumpers to the KDJ11-A page on the Computer History Wiki, mumble....)
Noel From roam at ringlet.net Wed May 20 05:45:34 2020 From: roam at ringlet.net (Peter Pentchev) Date: Tue, 19 May 2020 22:45:34 +0300 Subject: [TUHS] v7 K&R C In-Reply-To: References: <202005141841.04EIfvEZ063529@tahoe.cs.Dartmouth.EDU> <20200515150122.GF30160@mcvoy.com> <014001d62af3$9cc209b0$d6461d10$@ronnatalie.com> Message-ID: <20200519194534.GH140052@straylight.m.ringlet.net> On Sat, May 16, 2020 at 11:39:54AM -0600, Warner Losh wrote: > On Sat, May 16, 2020 at 10:28 AM Paul Winalski > wrote: > > > > On Fri, May 15, 2020 at 4:02 PM wrote: > > > > > >Unfortunately, if c is char on a machine with unsigned chars, or it’s of > > >type unsigned char, the EOF will never be detected. > > > > > > - while ((c = getchar()) != EOF) if (c == '\n') { /* entire record is > > now there */ > > > > The function prototype for getchar() is: int getchar(void); > > > > It returns an int, not a char. In all likelihood this is specifically > > *because* EOF is defined as -1. The above code works fine if c is an > > int. One always has to be very careful when doing a typecast of a > > function return value. > > > > In the early days of my involvement with FreeBSD, I went through and fixed > about a dozen cases where getopt was being assigned to a char and then > compared with EOF. I'm certain that this is why. Also EOF has to be a value > that's not representable by a character, or your 0xff bytes would disappear. I think I remember a code review on one of my patches to du(1), I think, something about adding an option to ignore specific names when recursing, and I remember either you or BDE chastising me about while (ch = getopt(...), ch != EOF) :) G'luck, Peter -- Peter Pentchev roam at ringlet.net roam at debian.org pp at storpool.com PGP key: http://people.FreeBSD.org/~roam/roam.key.asc Key fingerprint 2EE7 A7A5 17FC 124C F115 C354 651E EFB0 2527 DF13 -------------- next part -------------- A non-text attachment was scrubbed... 
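A minimal sketch of the getchar()/EOF point made in the message above: the value returned by getc() or getchar() has to be kept in an int, because EOF must be distinguishable from every possible byte value. The function names count_lines, eof_survives_unsigned_char and byte_ff_mistaken_for_eof are hypothetical and exist only for this illustration.

```c
#include <stdio.h>

/* Count newline-terminated records on a stream. c is an int, so EOF
 * (a negative int) cannot be confused with any data byte. */
long count_lines(FILE *fp)
{
    int c;
    long n = 0;

    while ((c = getc(fp)) != EOF)
        if (c == '\n')
            n++;
    return n;
}

/* With an unsigned char the test can never succeed: the stored value
 * promotes to a non-negative int, and EOF is negative. */
int eof_survives_unsigned_char(void)
{
    unsigned char c = (unsigned char)EOF;
    return c == EOF;
}

/* With a signed char the opposite failure is possible: a data byte of
 * 0xFF can compare equal to EOF. The conversion is implementation-defined,
 * so the result depends on the platform. */
int byte_ff_mistaken_for_eof(void)
{
    signed char c = (signed char)0xFF;
    return c == EOF;
}
```

The same reasoning applies to the getopt() case mentioned in the thread: its return value is an int and should be stored in one before being compared with the end marker.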
Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From ron at ronnatalie.com Wed May 20 08:27:31 2020 From: ron at ronnatalie.com (ron at ronnatalie.com) Date: Tue, 19 May 2020 18:27:31 -0400 Subject: [TUHS] PiDP Message-ID: <03fc01d62e2c$b069f0e0$113dd2a0$@ronnatalie.com> I bought one of Oscar's kits a while back and I just got around to putting it together. It was kind of neat booting up a straight V6 (it's been a long time) and playing with those DEC operating systems that I had learned to hate (Really Sh-ty Timesharing System). I had hoped that someone, somewhere had a JHU/BRL dist around (well if you do, I'd love to get it). But since the thing is mostly just a decoration in my office, I just hacked the idle function in 2.11 BSD to mimic the one that JHU Unix had. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jqcoffey at gmail.com Wed May 20 09:18:23 2020 From: jqcoffey at gmail.com (Justin Coffey) Date: Tue, 19 May 2020 16:18:23 -0700 Subject: [TUHS] PiDP In-Reply-To: <03fc01d62e2c$b069f0e0$113dd2a0$@ronnatalie.com> References: <03fc01d62e2c$b069f0e0$113dd2a0$@ronnatalie.com> Message-ID: Warning!!! Snarky/useless comment as my first-post! > On May 19, 2020, at 3:27 PM, wrote: > It was kind of neat booting up a straight V6 (it’s been a long time) Ok, so it's /either/ a V6 or a straight-6. Can't have both :). -Justin -------------- next part -------------- An HTML attachment was scrubbed... URL: From jacob.ritorto at gmail.com Wed May 20 11:20:20 2020 From: jacob.ritorto at gmail.com (Jacob Ritorto) Date: Tue, 19 May 2020 21:20:20 -0400 Subject: [TUHS] 2.11bsd boot countdown stuck at <5> on 11/73 In-Reply-To: <20200519185059.05A1518C085@mercury.lcs.mit.edu> References: <20200519185059.05A1518C085@mercury.lcs.mit.edu> Message-ID: Shoot, already enabled. Gonna have to dive deeper. Thanks for the idea though, Noel! 

On Tue, May 19, 2020 at 2:50 PM Noel Chiappa wrote:

>     > From: Jacob Ritorto
>
>     > I had a suspicion that maybe the clock isn't working on the 11/73.
>     > Would that make sense?
>
> Would definitely explain the symptoms.
>
>     > Maybe I have to check jumpers on the 11/73 to be sure its clock is
>     > enabled?
>
> There is a jumper on the KDJ11-A that enables/disables the clock register;
> W9 (closest to the handles) should be removed to enable the clock.
>
> (I really need to add the jumpers to the KDJ11-A page on the Computer
> History Wiki, mumble....)
>
>         Noel
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From doug at cs.dartmouth.edu  Wed May 20 13:29:48 2020
From: doug at cs.dartmouth.edu (Doug McIlroy)
Date: Tue, 19 May 2020 23:29:48 -0400
Subject: [TUHS] Chaos networking in 8th edition (and in 7th)
Message-ID: <202005200329.04K3TmBF006901@tahoe.cs.Dartmouth.EDU>

>> I don't recall any use of Chaos in 1127. Possibly one of
>> the nearby groups who also used the Research system needed
>> it at some point

> Speculating wildly, maybe there was a Lisp machine somewhere?

None that I can remember.

Doug

From rdm at cfcl.com  Wed May 20 13:52:17 2020
From: rdm at cfcl.com (Rich Morin)
Date: Tue, 19 May 2020 20:52:17 -0700
Subject: [TUHS] v7 K&R C
In-Reply-To: <20200519194534.GH140052@straylight.m.ringlet.net>
References: <202005141841.04EIfvEZ063529@tahoe.cs.Dartmouth.EDU> <20200515150122.GF30160@mcvoy.com> <014001d62af3$9cc209b0$d6461d10$@ronnatalie.com> <20200519194534.GH140052@straylight.m.ringlet.net>
Message-ID: <1CB35E55-B577-491F-B293-A4A0686FFCDC@cfcl.com>

On a vaguely-related note, for some years I've been using the phrase "algebraic syntax" to characterize languages such as Algol, C/C++, Fortran, Java(Script), Ruby, etc. Contrary examples might include Assembler, COBOL, Forth, Lisp, RPG, etc.

However, I can't find this usage in Wikipedia or elsewhere in the Intertubes. Am I simply confused? Is there a better term to use?
Inquiring gnomes need to mine... -r From robpike at gmail.com Wed May 20 16:06:23 2020 From: robpike at gmail.com (Rob Pike) Date: Wed, 20 May 2020 16:06:23 +1000 Subject: [TUHS] Chaos networking in 8th edition (and in 7th) In-Reply-To: <202005200329.04K3TmBF006901@tahoe.cs.Dartmouth.EDU> References: <202005200329.04K3TmBF006901@tahoe.cs.Dartmouth.EDU> Message-ID: I have vague memories of one in 1135 or 1138. Could just be thinking of the Symbolics poster by Bart's desk, though, the one advertising Emacs's "over 400 easy to use commands". -rob On Wed, May 20, 2020 at 1:30 PM Doug McIlroy wrote: > >> I don't recall any use of Chaos in 1127. Possibly one of > >> the nearby groups who also used the Research system needed > >> it at some point > > > Speculating wildly, maybe there was a Lisp machine somewhere? > > None that I can remember. > > Doug > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ches at cheswick.com Wed May 20 21:43:51 2020 From: ches at cheswick.com (William Cheswick) Date: Wed, 20 May 2020 07:43:51 -0400 Subject: [TUHS] Chaos networking in 8th edition (and in 7th) In-Reply-To: References: <202005200329.04K3TmBF006901@tahoe.cs.Dartmouth.EDU> Message-ID: <33D54554-9309-4D82-9ACA-9B090401C1C5@cheswick.com> Lillian Schwartz used a symbolics, and has missed it in recent years. > On May 20, 2020, at 2:06 AM, Rob Pike wrote: > > I have vague memories of one in 1135 or 1138. Could just be thinking of the Symbolics poster by Bart's desk, though, the one advertising Emacs's "over 400 easy to use commands". > > -rob > > > On Wed, May 20, 2020 at 1:30 PM Doug McIlroy wrote: > >> I don't recall any use of Chaos in 1127. Possibly one of > >> the nearby groups who also used the Research system needed > >> it at some point > > > Speculating wildly, maybe there was a Lisp machine somewhere? > > None that I can remember. > > Doug From jpl.jpl at gmail.com Wed May 20 22:30:06 2020 From: jpl.jpl at gmail.com (John P. 
Linderman) Date: Wed, 20 May 2020 08:30:06 -0400 Subject: [TUHS] Chaos networking in 8th edition (and in 7th) In-Reply-To: <33D54554-9309-4D82-9ACA-9B090401C1C5@cheswick.com> References: <202005200329.04K3TmBF006901@tahoe.cs.Dartmouth.EDU> <33D54554-9309-4D82-9ACA-9B090401C1C5@cheswick.com> Message-ID: Who had the symbolics Lillian used? I know it wasn't the MH wing of 113. On Wed, May 20, 2020 at 7:50 AM William Cheswick wrote: > Lillian Schwartz used a symbolics, and has missed it in recent years. > > > On May 20, 2020, at 2:06 AM, Rob Pike wrote: > > > > I have vague memories of one in 1135 or 1138. Could just be thinking of > the Symbolics poster by Bart's desk, though, the one advertising Emacs's > "over 400 easy to use commands". > > > > -rob > > > > > > On Wed, May 20, 2020 at 1:30 PM Doug McIlroy > wrote: > > >> I don't recall any use of Chaos in 1127. Possibly one of > > >> the nearby groups who also used the Research system needed > > >> it at some point > > > > > Speculating wildly, maybe there was a Lisp machine somewhere? > > > > None that I can remember. > > > > Doug > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave at horsfall.org Fri May 22 01:06:01 2020 From: dave at horsfall.org (Dave Horsfall) Date: Fri, 22 May 2020 01:06:01 +1000 (EST) Subject: [TUHS] v7 K&R C In-Reply-To: <1CB35E55-B577-491F-B293-A4A0686FFCDC@cfcl.com> References: <202005141841.04EIfvEZ063529@tahoe.cs.Dartmouth.EDU> <20200515150122.GF30160@mcvoy.com> <014001d62af3$9cc209b0$d6461d10$@ronnatalie.com> <20200519194534.GH140052@straylight.m.ringlet.net> <1CB35E55-B577-491F-B293-A4A0686FFCDC@cfcl.com> Message-ID: On Tue, 19 May 2020, Rich Morin wrote: > On a vaguely-related note, for some years I've been using the phrase > "algebraic syntax" to characterize languages such as Algol, C/C++, > Fortran, Java(Script), Ruby, etc. Contrary examples might include > Assembler, COBOL, Forth, Lisp, RPG, etc. 
My benchmark is "Can it be described in BNF?" LISP, for example, would be something like: phrase: "(" phrase ")" > However, I can't find this usage in Wikipedia or elsewhere in the > Intertubes. Am I simply confused? Is there a better term to use? > Inquiring gnomes need to mine... You are confused because you are relying upon Wikipedia :-) Well, someone had to say it, so it may as well be me; as I keep saying, it's only as accurate as the last idiot who updated it. -- Dave, a Wikipedia editor From doug at cs.dartmouth.edu Fri May 22 01:12:02 2020 From: doug at cs.dartmouth.edu (Doug McIlroy) Date: Thu, 21 May 2020 11:12:02 -0400 Subject: [TUHS] algebraic syntax (was v7 K&R C) Message-ID: <202005211512.04LFC2e6006407@tahoe.cs.Dartmouth.EDU> > for some years I've been using the phrase "algebraic syntax" to > characterize languages such as Algol, C/C++, Fortran, Java(Script), Ruby, etc. ... > However, I can't find this usage in Wikipedia or elsewhere I think "Algol-like" is the closest term in common use, though it doesn't have the exact connotation that I think you intend. Nowadays, I think of languages like Haskell as being "algebraic" in the deeper sense of having taken much inspiration from modern algebra, and being preeminently suitable for application to algebraic domains. Vic Vyssotsky used the term "narrative language" as quite a close synonym of your "algebraic syntax", but I think his usage was equally idiosyncratic. Doug From dave at horsfall.org Fri May 22 01:18:39 2020 From: dave at horsfall.org (Dave Horsfall) Date: Fri, 22 May 2020 01:18:39 +1000 (EST) Subject: [TUHS] Chaos networking in 8th edition (and in 7th) In-Reply-To: References: <202005200329.04K3TmBF006901@tahoe.cs.Dartmouth.EDU> Message-ID: On Wed, 20 May 2020, Rob Pike wrote: > I have vague memories of one in 1135 or 1138. Could just be thinking of > the Symbolics poster by Bart's desk, though, the one advertising > Emacs's "over 400 easy to use commands". 

My EMACS gatherings:

EMACS - eight megs and constantly swapping

"Enough Memory? A Concept Strange!"

I thought it stood for Escape-Meta-Alt-Control-Shift

Emacs Makes A Computer Slow

Eventually Munches All Computer Storage

More contributions welcome... I'll put 'em on my pitiful excuse for a web page some day.

-- Dave, a committed VI user (and probably who ought to be committed)

From coppero1237 at gmail.com Fri May 22 01:27:26 2020
From: coppero1237 at gmail.com (Tyler Adams)
Date: Thu, 21 May 2020 18:27:26 +0300
Subject: [TUHS] History of popularity of C
Message-ID:

Does anybody have any good resources on the history of the popularity of C? I'm looking for data to resolve a claim that C is so prolific and influential because it's so easy to write a C compiler.

Tyler

From toby at telegraphics.com.au Fri May 22 02:10:35 2020
From: toby at telegraphics.com.au (Toby Thain)
Date: Thu, 21 May 2020 12:10:35 -0400
Subject: [TUHS] History of popularity of C
In-Reply-To:
References:
Message-ID: <8a2e9b1b-8890-a783-5b53-c8480c070f2e@telegraphics.com.au>

On 2020-05-21 11:27 AM, Tyler Adams wrote:
> Does anybody have any good resources on the history of the popularity of
> C? I'm looking for data to resolve a claim that C is so prolific and
> influential because it's so easy to write a C compiler.
>
> Tyler

Based on recollections of C from mid-1980s until today, this claim doesn't make sense for several reasons.
Sorry, this is all anecdata or recollection, not cited data:

- inexpensive compiler availability was not very good until ~1990 or later, but C had been taking off like wildfire for 10 years before that

- developing good compilers is certainly not "easy" - and there were a lot of mediocre vendor compilers despite (duplicated) investment

- by the time gcc was mature (by some definition, but probably before 1990) - something that happened largely as a reaction to the vendor compiler situation - it was a large and complicated codebase even by standards of the time

- hobby/novelty/small/educational compilers are a relatively new thing and arrived long after the C adoption curve was complete. The earliest well known example I can think of is lcc (1994) but most are much newer.

...and probably quite a few other points.

--T

From jcapp at anteil.com Fri May 22 02:18:56 2020
From: jcapp at anteil.com (Jim Capp)
Date: Thu, 21 May 2020 12:18:56 -0400 (EDT)
Subject: [TUHS] History of popularity of C
In-Reply-To:
Message-ID: <11563272.2136.1590077936597.JavaMail.root@zimbraanteil>

Again, based on recollections, what got me immediately interested was that I regarded C as a "portable assembler". It was one of the earliest implementations of "write-once-run-anywhere".

From: "Tyler Adams"
To: "The Eunuchs Hysterical Society"
Sent: Thursday, May 21, 2020 11:27:26 AM
Subject: [TUHS] History of popularity of C

Does anybody have any good resources on the history of the popularity of C? I'm looking for data to resolve a claim that C is so prolific and influential because it's so easy to write a C compiler.

Tyler

From cym224 at gmail.com Fri May 22 02:29:53 2020
From: cym224 at gmail.com (Nemo Nusquam)
Date: Thu, 21 May 2020 12:29:53 -0400
Subject: [TUHS] Chaos networking in 8th edition (and in 7th)
In-Reply-To:
References: <202005200329.04K3TmBF006901@tahoe.cs.Dartmouth.EDU>
Message-ID: <56d4090d-a02e-6056-6202-29e32b19e678@gmail.com>

On 05/21/20 11:18, Dave Horsfall wrote (in part):
> More contributions welcome... I'll put 'em on my pitiful excuse for a
> web page some day.

This is a pathetic contribution as I cannot remember when or where but I recall a cartoon from decades ago that showed someone sitting at a terminal and telling the person behind him that this editor (emacs) hits the disc a little too often. The disc pack can be seen flying up from its drive.

> -- Dave, a committed VI user (and probably who ought to be committed)

N. (mostly emacs these days but I use both #6-)

From lm at mcvoy.com Fri May 22 02:30:42 2020
From: lm at mcvoy.com (Larry McVoy)
Date: Thu, 21 May 2020 09:30:42 -0700
Subject: [TUHS] History of popularity of C
In-Reply-To: <8a2e9b1b-8890-a783-5b53-c8480c070f2e@telegraphics.com.au>
References: <8a2e9b1b-8890-a783-5b53-c8480c070f2e@telegraphics.com.au>
Message-ID: <20200521163042.GQ12554@mcvoy.com>

On Thu, May 21, 2020 at 12:10:35PM -0400, Toby Thain wrote:
> On 2020-05-21 11:27 AM, Tyler Adams wrote:
> > Does anybody have any good resources on the history of the popularity of
> > C? I'm looking for data to resolve a claim that C is so prolific and
> > influential because it's so easy to write a C compiler.
> >
> > Tyler
>
> Based on recollections of C from mid-1980s until today, this claim
> doesn't make sense for several reasons.
Sorry, this is all anecdata or
> recollection, not cited data:
>
> - inexpensive compiler availability was not very good until ~1990 or
> later, but C had been taking off like wildfire for 10 years before that
>
> - developing good compilers is certainly not "easy" - and there were a
> lot of mediocre vendor compilers despite (duplicated) investment
>
> - by the time gcc was mature (by some definition, but probably before
> 1990) - something that happened largely as a reaction to the vendor
> compiler situation - it was a large and complicated codebase even by
> standards of the time
>
> - hobby/novelty/small/educational compilers are a relatively new thing
> and arrived long after the C adoption curve was complete. The earliest
> well known example I can think of is lcc (1994) but most are much newer.
>
> ...and probably quite a few other points.

This matches my memory as well. I think I learned C in 1983 or 84, it just worked. To me it felt like it was PDP-11 assembler only nicer.

The thing I liked about C is that you always felt like you were right on the metal, it didn't hide the fact that there was a computer under it. Very different feel from, say, Pascal. I think the fact that you could feel the machine under the language had a lot to do with it taking off.

And what Toby said about compilers, oh, man, so true. Once you got out in the real world, gcc was buggy and slow, companies wanted to charge you at every step of the way for compilers that were marginally better than gcc at best. When gcc finally got good enough, I agree, around 1990 or so, it was a relief. You just used it and ignored the platform specific ones. G++ took a long time to be good enough.

From dot at dotat.at Fri May 22 02:43:03 2020
From: dot at dotat.at (Tony Finch)
Date: Thu, 21 May 2020 17:43:03 +0100
Subject: [TUHS] History of popularity of C
In-Reply-To: <8a2e9b1b-8890-a783-5b53-c8480c070f2e@telegraphics.com.au>
References: <8a2e9b1b-8890-a783-5b53-c8480c070f2e@telegraphics.com.au>
Message-ID:

Toby Thain wrote:
>
> - inexpensive compiler availability was not very good until ~1990 or
> later, but C had been taking off like wildfire for 10 years before that

I get the impression that an important part of its popularity was how C (and C++) became the language of choice on the PC, and displaced Pascal in the process.

Tony.
-- 
f.anthony.n.finch http://dotat.at/
Trafalgar: Northerly or northwesterly, backing southwesterly in northwest, 3 to 5. Moderate, occasionally slight in southeast. Fair. Good.

From lars at nocrew.org Fri May 22 02:59:25 2020
From: lars at nocrew.org (Lars Brinkhoff)
Date: Thu, 21 May 2020 16:59:25 +0000
Subject: [TUHS] Chaos networking in 8th edition (and in 7th)
In-Reply-To: (Dave Horsfall's message of "Fri, 22 May 2020 01:18:39 +1000 (EST)")
References: <202005200329.04K3TmBF006901@tahoe.cs.Dartmouth.EDU>
Message-ID: <7wr1vdgr6a.fsf@junk.nocrew.org>

Dave Horsfall wrote:
> My EMACS gatherings:
>
> EMACS - eight megs and constantly swapping
> ...

GNU Emacs comes with a file called JOKES which has most of those.
A selection:

Emacs Means A Crappy Screen
Egregious Managers Actively Court Stallman
Generally Not Used Except by Middle Aged Computer Scientists
Emacs Makers Are Crazy Sickos

From arnold at skeeve.com Fri May 22 03:35:50 2020
From: arnold at skeeve.com (arnold at skeeve.com)
Date: Thu, 21 May 2020 11:35:50 -0600
Subject: [TUHS] History of popularity of C
In-Reply-To:
References: <8a2e9b1b-8890-a783-5b53-c8480c070f2e@telegraphics.com.au>
Message-ID: <202005211735.04LHZoUr006011@freefriends.org>

> Toby Thain wrote:
> >
> > - inexpensive compiler availability was not very good until ~1990 or
> > later, but C had been taking off like wildfire for 10 years before that

PCC contributed to this. Everybody and their brother was porting Unix to their fancy new CPU architecture / hardware. All you had to do was bootstrap a cross-compiler version of PCC on a PDP-11 (or more likely Vax), then get Unix to boot and Voila.

(I remember reading a paper about how Motorola did just that for the MC 680x0 family.)

C and Unix were established in Academia and Industry well before 1990.

> I get the impression that an important part of its popularity was how C
> (and C++) became the language of choice on the PC, and displaced Pascal in
> the process.

C++ became the language of choice on the PC when MSFT started pushing its compiler and Visual Studio IDE.

At least, this is my two cents.

Arnold

From jnc at mercury.lcs.mit.edu Fri May 22 04:28:17 2020
From: jnc at mercury.lcs.mit.edu (Noel Chiappa)
Date: Thu, 21 May 2020 14:28:17 -0400 (EDT)
Subject: [TUHS] History of popularity of C
Message-ID: <20200521182817.08C0318C093@mercury.lcs.mit.edu>

> From: Tyler Adams

> C is so prolific and influential because it's so easy to write a C
> compiler.

I'm not sure the implied corollary ('it's _not_ easy to write compilers for other languages') is correct.

As a datapoint, I pulled "Algol 60 Implementation" (Randell and Russell) off the shelf, and it reveals that the Algol 60 compiler discussed there (for the KDF9), using lessons from the Algol compiler for the Electrologica X1, was 3600 words (roughly 3 instructions/word). So it was small. Now, small is not necessarily equivalent to easy, but it was clearly not a mountainous job. I imagine early BCPL, etc. compilers were roughly similar.

The only language from that era which I can think of which was a slog, compiler-wise, was PL/I.

I suspect the real reason for C's success was the nature of the language. When I first saw it (ca. 1976), it struck me as a quantum improvement over its contemporaries.

Noel

From jfoust at threedee.com Fri May 22 03:22:17 2020
From: jfoust at threedee.com (John Foust)
Date: Thu, 21 May 2020 12:22:17 -0500
Subject: [TUHS] History of popularity of C
In-Reply-To: <20200521163042.GQ12554@mcvoy.com>
References: <8a2e9b1b-8890-a783-5b53-c8480c070f2e@telegraphics.com.au> <20200521163042.GQ12554@mcvoy.com>
Message-ID: <20200521183133.574EE9C6FC@minnie.tuhs.org>

At 11:30 AM 5/21/2020, Larry McVoy wrote:
>This matches my memory as well. I think I learned C in 1983 or 84,
>it just worked. To me it felt like it was PDP-11 assembler only nicer.

One thing that stuck with me about our experience at UW-Madison at that time was that there wasn't a course that taught C yet some courses were taught in C. "Here's K&R, there's the Unix manuals, get to it."

>When gcc finally got good enough, I agree, around
>1990 or so, it was a relief. You just used it and ignored the platform
>specific ones. G++ took a long time to be good enough.

There's the broader history of the languages that were popular in the IBM PC market in the 80s and 90s, too. In that at least numerically larger market, there were times when C was not on top for many small-time developers. Let's not forget Turbo Pascal (shipped 1983 to 1995) and Turbo C and C++ (1987-1995).

In 1986 or so on the PC, I was using the Gimpel C-terp interpreted C and their fine PC-lint to speed development (which Clem Cole has mentioned here before and which is still sold (!)) in conjunction with shipping code under the Lattice and Microsoft C compilers of that time.

In the mid- to late 80s, there's the rise of the flat address space 68000 machines like Amiga and Atari which could enjoy the cross-pollination of code ported from Unix C environments.

On the Mac, Apple's MacApp environment was their Object Pascal and not C++ until 1991. Think C came out in 1986.

In the late 1980s, 32-bit DOS extenders arose that let you write DOS programs in C that had true 32-bit pointers and didn't need to worry about 64K segments as much, followed by Microsoft's Win32s in late 1992 that allowed that freedom under Windows 3.1.

- John

From dfawcus+lists-tuhs at employees.org Fri May 22 04:39:46 2020
From: dfawcus+lists-tuhs at employees.org (Derek Fawcus)
Date: Thu, 21 May 2020 19:39:46 +0100
Subject: [TUHS] EMACS (was Re: Chaos networking in 8th edition (and in 7th))
In-Reply-To:
References: <202005200329.04K3TmBF006901@tahoe.cs.Dartmouth.EDU>
Message-ID: <20200521183946.GA33018@clarinet.employees.org>

On Fri, May 22, 2020 at 01:18:39AM +1000, Dave Horsfall wrote:
>
> My EMACS gatherings:
>
> EMACS - eight megs and constantly swapping

The version I recall is "eats memory and constantly swaps"

DF

From thomas.paulsen at firemail.de Fri May 22 04:44:42 2020
From: thomas.paulsen at firemail.de (Thomas Paulsen)
Date: Thu, 21 May 2020 20:44:42 +0200
Subject: [TUHS] History of popularity of C
In-Reply-To: <20200521182817.08C0318C093@mercury.lcs.mit.edu>
References: <20200521182817.08C0318C093@mercury.lcs.mit.edu>
Message-ID: <97649ed8e0655b6d875135c20fe8062e@firemail.de>

>I suspect the real reason for C's success was the nature of the language.
it has most of the elements of structured programming as known in the 70s/80s, and - most important - it produces small and fast performing binaries like no other high level language. Furthermore its syntax is relatively close to the system, and system calls are easily adoptable. Thus for me it still is and ever will be the first choice.

From paul.winalski at gmail.com Fri May 22 04:50:17 2020
From: paul.winalski at gmail.com (Paul Winalski)
Date: Thu, 21 May 2020 14:50:17 -0400
Subject: [TUHS] Chaos networking in 8th edition (and in 7th)
In-Reply-To:
References: <202005200329.04K3TmBF006901@tahoe.cs.Dartmouth.EDU>
Message-ID:

On 5/21/20, Dave Horsfall wrote:
>
> My EMACS gatherings:
>
> EMACS - eight megs and constantly swapping
>
> "Enough Memory? A Concept Strange!"
>
> I thought it stood for Escape-Meta-Alt-Control-Shift
>
> Emacs Makes A Computer Slow
>
> Eventually Munches All Computer Storage
>
> More contributions welcome... I'll put 'em on my pitiful excuse for a web
> page some day.

Escape-Meta-Alt-Control-Shift

-Paul W.

From a.phillip.garcia at gmail.com Fri May 22 04:58:01 2020
From: a.phillip.garcia at gmail.com (A. P. Garcia)
Date: Thu, 21 May 2020 14:58:01 -0400
Subject: [TUHS] History of popularity of C
In-Reply-To:
References:
Message-ID:

From memory, there is a History of Programming Languages book from an ACM conference that contains some papers that were presented there, along with some notes from Q&A sessions that followed. I'm paraphrasing, but Dennis Ritchie said something flattering about Pascal, that it was essentially the same language as C. Given this, asked Niklaus Wirth, why do you suppose that C is so much more popular than Pascal? Ritchie answered, "I don't know".

My personal opinion is that Ken Thompson is not given enough credit for the beauty and expressiveness of C, as much of this comes from its predecessor, B, which is essentially Thompson's "remix" of BCPL.

On Thu, May 21, 2020, 11:28 AM Tyler Adams wrote:

> Does anybody have any good resources on the history of the popularity of
> C? I'm looking for data to resolve a claim that C is so prolific and
> influential because it's so easy to write a C compiler.
>
> Tyler
>

From clemc at ccc.com Fri May 22 05:02:31 2020
From: clemc at ccc.com (Clem Cole)
Date: Thu, 21 May 2020 15:02:31 -0400
Subject: [TUHS] History of popularity of C
In-Reply-To:
References:
Message-ID:

On Thu, May 21, 2020 at 11:28 AM Tyler Adams wrote:

> Does anybody have any good resources on the history of the popularity of
> C? I'm looking for data to resolve a claim that C is so prolific and
> influential because it's so easy to write a C compiler.
>

Hmmmm, I don't know what's been written, but old Dr. Dobbs and Byte Mag are where I would start from 1975-85 (which I have in my attic, but lack a good index).

Let me give you my experience and recollections, although Larry may scream that my catalog of memories is colored by what he likes to call the UNIX club.

Academics in the mid-late 70s all got UNIX with full sources, originally to the Ritchie C Compiler and later the Johnson compiler. Plus we had access to yacc/lex and the first editions of the dragon book.

Before C (or B) shows up there already were 'system programming languages' such as BCPL, BLISS, PL/360, and ESPOL (much less full languages like Fortran, the Algol family, PL/1) which people were also trying to use for systems work.

In '73, C had been retargeted for the PDP-10 by Alan Snyder @ MIT https://github.com/PDP-10/Snyder-C-compiler, but I believe that was a rewrite not a port of the Ritchie compiler. I believe this was the first retarget, at least outside of the MH. Before that C had been retargeted to the Honeywell, Interdata and I believe the S/360 -- Steve and Doug can probably say more.

The 8-bit microprocessor arrives on the scene in 75 and the 16-bit ones 4 years later. Many of us at different universities wrote assemblers and linkers for the same, often in C under UNIX. CMU had a SAIL based 6502 assembler in the CS Dept that was used to burn ROMs, but it ran on the PDP-10. There must have been an 8080 assembler over there too, but I don't remember it. The 10s were more difficult to use in the EE building and tough to use with the KIM-1s.

I wrote one for the 6502, 8085, and the Z80 for the EE department on our 11/34 UNIX V6 system. Ted Kowalski wrote the predecessor to the eventual UNIX cu(1) program, which we called connect(1), that allowed us to download code to the KIMs (and other micros) from the UNIX systems that we had in the EE lab (which he took back to USG, was rewritten and went into both PWB and eventually TS and V7).

The Purdue 8-bit micro suite would eventually become popular because it supported full relocation and a linker, which microprocessor support tools like the one I wrote did not. There was a group at Purdue in EE that started to retarget the Ritchie C Compiler, but I've lost track of what happened. Mike Zuhl or Ward Cunningham might remember what became of that (more in a minute) - I'm pretty sure Ward was mixed up in that -- check his web site, you might find stuff there, or we can ask either of them (Steinhart might know some of it too, as we were all working with the original Microprocessor team in Tek Walker Road in the late 1970s).

The first microprocessor targeted C compiler I personally used was the one from Teletype Corporation which had retargeted the Ritchie C Compiler to the Z80 in 1977/78 IIRC (that Phil Karn brought to CMU). He and I hacked it to use my assembler and got it to spit out 8085 code for our semester project for Steely Dan's Real-Time course.
This was the original C compiler he used for the KA9Q TCP/IP, although at some point he switched to Leor Zolman's Brain-Damaged Systems (BDS) compiler ( https://en.wikipedia.org/wiki/BDS_C) after we both left CMU.

In the late 1970s ('78 I think), Dennis Allison was teaching a course at Stanford. The assignment was to develop TinyBasic (for the 6502 IIRC). Some of these got presented at an early AMW (talk to Bob Berg if you want to try to find the date). This idea spread to a lot of places and the idea of 'TinyX' or SmallX was started. By late 1979/early 1980, Ron Cain (one of his students I believe) used an SRI based UNIX system to develop his 'Small C' that he would publish the sources to in Byte and eventually a book that was used to teach (which I still have): https://en.wikipedia.org/wiki/Small-C. The Small C compiler would get retargeted to the other 8-bit micros and you can find most of them with a search engine.

The best I can tell, Leor and Ron worked independently of each other. Leor's compiler was a tad more complete and he actually wrote a UNIX clone for the Z80 with it (I don't remember if Leor had fp support, Ron did not). Leor had access to the Ritchie compiler, but he seems to have written it himself (you can search for and download the sources and decide yourself). Leor showed many of us his systems running on 3 8" floppies at the Boston USENIX in the early 1980s [I remember dmr playing with it and remarking how much it reminded him of early UNIX on the PDP-11].

Also, after I left CMU in 1979, I took the Ritchie compiler and retargeted it to what would become the 68000 (it was not yet released or numbered when I started). Paul Blanter of Tek labs wrote the assembler and Steve Glaser and I hacked v7's ld a little. This was the original tool suite for the Magnolia system.

The folks in the MIT RTS Group had started to retarget the Johnson compilers to the 8086, the 68000 and eventually the Z8000 as part of the NU project and Trix (I know Jack Test, who had previously been at Stanford, had his hand in this -- tjt wrote the MIT 68000 assembler that used an MIT hacked version of V7's ld, I think John Siber did the C8086). Around the same time, CMU started the Mach project and created the Mach-O format. Robert Baron and Mike Accetta were heavily involved, but I think they took the MIT compilers as the basis for some of that work.

At some point (Steve can fill us in) I thought someone in USG started to retarget his compilers for USG. This is the source of the AT&T assembler and is what ISC started with when they did the 386 ports a few years later for AT&T that Heinz talked about a few weeks ago.

Meanwhile, Gordon Letwin, who had been at Purdue EE, brought the Purdue assemblers and forked from the C compiler work at some point. He and Bob Greenberg did the start of the compilers for the original Xenix work for the 8086; we would have to ask Bob or Gordon for more details [Gordon is believed to be the source of the terrible curse called the 'far' pointer].

By the early 1980s, a number of UNIX ports start and many C Compilers show up. I think John Bass did the Onyx Z8000 C compiler independent of the MIT code base, but the MIT NU C compilers and the NU UNIX port would become used by a lot of the 'JAWS' work that would start to ramp in the early 1980s.

Anyway -- the point is we all had access to the UNIX sources (sorry Larry) and we started to hack on them. Plus different Universities doing compiler work, like Andy Tanenbaum, released compilers (ACK) independent of the AT&T code origin but built/bootstrapped from UNIX/the UNIX toolkit. Waterloo, Edinburgh, and others also all put something out.
Plus you start to see commercial C implementations like Intermetrics, Tartan Labs, Green Hills (in fact IIRC the Apple Mac C Compiler was developed under contract by Green Hills).

What I am leaving out is the BASIC and Pascal wars that were going on at the same time. The 8-bit micros, in particular, went BASIC crazy. The 'CS types' at many Universities (like mine at CMU) had considered BASIC, C, and Fortran as 'ugly' and were using an Algol or a more Algol-like language as the future (Pascal was the premier teaching language at the time). For reasons we can talk about in COFF, Pascal diverged (in 1980 at one of the Hatfield and McCoy parties at Steve Glaser's, a couple of us counted 14 incompatible 'HP-BASIC's and 8 different 'Tek Pascal's in use).

Here comes the final thing that happened... By the early to mid-80s, all us UNIX folks were happy using UNIX derived C compilers, like the NU suite. But as Larry points out, there was a whole group of people that could not get UNIX sources or tools.

Stallman sets out to build his Gnu system and he needs a language and compiler. I've always been amazed he did not use LISP, other than the first tool he wanted was EMACS, and he got CMU (Gosling's) codebase to start. CMU-EMACS was in C (plus his 'mock-lisp' creation). So rms needed a C compiler and starts to hack mock-lisp to be more to his taste. But to make it widespread he needs a C compiler and microprocessor tools. So he starts to write his famous compiler -- which to me is the key thing he did. The Gnu project would release tools that ran pretty much anywhere and targeted the popular micros and generated 'good enough code.'

Cole's Law -- 'Simple Economics always beats Sophisticated Architecture.' It turns out Paul Winalski and I were just talking about this last week. I very much believe C 'won' the war for economic reasons. UNIX being 'Open Source' in the 70s to the University types did allow us to hack and >>share<< the compilers, either Ritchie or Johnson based.
Moore's Law caused the 16-bit micros to flourish and they ended up in systems. Unix taught a number of programmers the language and the tool suite, then we went to the real world and wanted it. Stallman's tools were there. It did not matter that there were 'better' languages (Pascal had forked, we also had new languages from occam to Modula, eventually C++ et al). The Gnu C compiler was cheap (free) and that was the final stroke.

Clem

From paul.winalski at gmail.com Fri May 22 05:06:08 2020
From: paul.winalski at gmail.com (Paul Winalski)
Date: Thu, 21 May 2020 15:06:08 -0400
Subject: [TUHS] History of popularity of C
In-Reply-To: <97649ed8e0655b6d875135c20fe8062e@firemail.de>
References: <20200521182817.08C0318C093@mercury.lcs.mit.edu> <97649ed8e0655b6d875135c20fe8062e@firemail.de>
Message-ID:

On 5/21/20, Thomas Paulsen wrote:
>>I suspect the real reason for C's success was the nature of the language.
> it has most of the elements of structured programming as known in the
> 70s/80s, and - most important - it produces small and fast performing
> binaries like no other high level language.

Sorry, but I can't agree with that statement (like no other high-level language). C is a decent language for systems programming but so are other languages such as BLISS. C is a terrible language if you have to throw arrays around (which is why Fortran still rules the roost in HPTC).

C, Pascal, and other modern Algol-ish languages have well-behaved grammars and were designed to be easy to lex and parse. Fortran and COBOL were designed before Chomsky's work on formal grammars became well known, and as a consequence are bears to parse. Fortran has context-sensitive lexical analysis, for example. But nobody knew any better back then.

-Paul W.

From corky1951 at comcast.net Fri May 22 05:16:26 2020
From: corky1951 at comcast.net (CHARLES KESTER)
Date: Thu, 21 May 2020 12:16:26 -0700 (PDT)
Subject: [TUHS] History of popularity of C
In-Reply-To: <202005211735.04LHZoUr006011@freefriends.org>
References: <8a2e9b1b-8890-a783-5b53-c8480c070f2e@telegraphics.com.au> <202005211735.04LHZoUr006011@freefriends.org>
Message-ID: <203511515.810845.1590088586459@connect.xfinity.com>

> On May 21, 2020 at 10:35 AM arnold at skeeve.com wrote:
>
> C++ became the language of choice on the PC when MSFT started pushing
> its compiler and Visual Studio IDE.

Microsoft C 7.0 already had a C++ compiler and an early version of MFC in 1992. But you're right: it was when Visual C++ 1.0 came out in 1993 that C++ became really popular among developers targeting Windows. VC1.0 introduced "wizards" for MFC that produced a skeleton application to which many people had to make only a few additions in order to come up with a shippable product. The market was soon flooded with apps that had what I called a "wizard smell". (The more charitable phrase was "look and feel".)

Of course, as with all framework-based code, wizard-generated apps couldn't distinguish themselves in the market for very long and the bar was raised. But by then C++ was well-established as the language of choice.

None of which has anything to do with Unix, I admit.

From toby at telegraphics.com.au Fri May 22 06:07:14 2020
From: toby at telegraphics.com.au (Toby Thain)
Date: Thu, 21 May 2020 16:07:14 -0400
Subject: [TUHS] History of popularity of C
In-Reply-To:
References: <8a2e9b1b-8890-a783-5b53-c8480c070f2e@telegraphics.com.au>
Message-ID: <9686cac5-688b-b2bc-6240-484999e7dc8c@telegraphics.com.au>

On 2020-05-21 12:43 PM, Tony Finch wrote:
> Toby Thain wrote:
>>
>> - inexpensive compiler availability was not very good until ~1990 or
>> later, but C had been taking off like wildfire for 10 years before that
>
> I get the impression that an important part of its popularity was how C
> (and C++) became the language of choice on the PC, and displaced Pascal in
> the process.

Yes, that's basically true, but I didn't try to cover the contemporary "appeal" of C, stylistic or otherwise - but only the compiler point (which I think is mostly false).

--Toby

>
> Tony.
>

From toby at telegraphics.com.au Fri May 22 06:09:51 2020
From: toby at telegraphics.com.au (Toby Thain)
Date: Thu, 21 May 2020 16:09:51 -0400
Subject: [TUHS] History of popularity of C
In-Reply-To: <202005211735.04LHZoUr006011@freefriends.org>
References: <8a2e9b1b-8890-a783-5b53-c8480c070f2e@telegraphics.com.au> <202005211735.04LHZoUr006011@freefriends.org>
Message-ID: <0d38d831-42bb-0371-38ec-2fbadfd26c7d@telegraphics.com.au>

On 2020-05-21 1:35 PM, arnold at skeeve.com wrote:
>> Toby Thain wrote:
>>>
>>> - inexpensive compiler availability was not very good until ~1990 or
>>> later, but C had been taking off like wildfire for 10 years before that
>
> PCC contributed to this. Everybody and their brother was porting Unix
> to their fancy new CPU architecture / hardware. All you had to do was
> bootstrap a cross-compiler version of PCC on a PDP-11 (or more likely
> Vax), then get Unix to boot and Voila.
>
> (I remember reading a paper about how Motorola did just that for
> the MC 680x0 family.)
>

Yes, but Johnson had already done the work.
Imho compilers were still considered pretty complex magic and you wouldn't lightly write one from scratch. And yeah all the vendors wanted to get a compiler out with minimal effort, which is why they often weren't very good.

> C and Unix were established in Academia and Industry well before 1990.
>
>> I get the impression that an important part of its popularity was how C
>> (and C++) became the language of choice on the PC, and displaced Pascal in
>> the process.
>
> C++ became the language of choice on the PC when MSFT started pushing
> its compiler and Visual Studio IDE.

That was much later.

--Toby

>
> At least, this is my two cents.
>
> Arnold
>

From dot at dotat.at Fri May 22 06:12:12 2020
From: dot at dotat.at (Tony Finch)
Date: Thu, 21 May 2020 21:12:12 +0100
Subject: [TUHS] History of popularity of C
In-Reply-To: <202005211735.04LHZoUr006011@freefriends.org>
References: <8a2e9b1b-8890-a783-5b53-c8480c070f2e@telegraphics.com.au> <202005211735.04LHZoUr006011@freefriends.org>
Message-ID:

arnold at skeeve.com wrote:
>
> > I get the impression that an important part of its popularity was how C
> > (and C++) became the language of choice on the PC, and displaced Pascal in
> > the process.
>
> C++ became the language of choice on the PC when MSFT started pushing
> its compiler and Visual Studio IDE.

C was winning years before that. I saw a comment on a certain orange website that referred to Dr Dobbs Journal, August 1986, which I found online at

https://archive.org/details/dr_dobbs_journal_vol_11/page/n541/mode/1up

On that page there are a few choice quotes from the archives (1983) about C from a PC perspective. The letters pages are 1/3 C. There are 8/10 pages of articles about C. Then there is a 23 page comparative review of 17 C compilers. It's remarkable :-)

Tony.
-- 
f.anthony.n.finch http://dotat.at/
Irish Sea: Southeast 3 or 4, increasing 5 to 7, veering southwest 6 to gale 8 later.
Smooth or slight, becoming moderate or rough, occasionally very rough later in south. Fair then rain or squally showers. Good occasionally poor. From toby at telegraphics.com.au Fri May 22 06:17:32 2020 From: toby at telegraphics.com.au (Toby Thain) Date: Thu, 21 May 2020 16:17:32 -0400 Subject: [TUHS] History of popularity of C In-Reply-To: <20200521183133.574EE9C6FC@minnie.tuhs.org> References: <8a2e9b1b-8890-a783-5b53-c8480c070f2e@telegraphics.com.au> <20200521163042.GQ12554@mcvoy.com> <20200521183133.574EE9C6FC@minnie.tuhs.org> Message-ID: On 2020-05-21 1:22 PM, John Foust wrote: > ... > In the mid- to late 80s, there's the rise of the flat address space > 68000 machines like Amiga and Atari which could enjoy the > cross-pollination of code ported from Unix C environments. > > On the Mac, Apple's MacApp environment was their Object Pascal > and not C++ until 1991. Think C came out in 1986. Few developers used MacApp, afaicr. Most used plain Pascal - initially Lisa Pascal, though that was before my time on Mac - and those who didn't like Pascal could write C on Mac before THINK, using tools like Aztec C (or even Whitesmiths, one of the earliest). Even though MPW was an excellent industrial strength environment with good Pascal and C compilers, the big vendors like Adobe adopted LIGHTSPEED-then-THINK-then-Symantec C quickly and rewrote Pascal apps (like Photoshop) in C early. Then CodeWarrior came along and ate THINK's lunch. --Toby > > ... > > - John > From thomas.paulsen at firemail.de Fri May 22 06:27:21 2020 From: thomas.paulsen at firemail.de (Thomas Paulsen) Date: Thu, 21 May 2020 22:27:21 +0200 Subject: [TUHS] History of popularity of C In-Reply-To: References: <20200521182817.08C0318C093@mercury.lcs.mit.edu> <97649ed8e0655b6d875135c20fe8062e@firemail.de> Message-ID: >Sorry, but I can't agree with that statement (like no other high-level >language). C is a decent language for systems programming but so are >other languages such as BLISS. 
C is a terrible language if you have >to throw arrays around (which is why Fortran still rules the roost in >HPTC). BLISS is known to a very small number of people and is thus irrelevant. As for arrays: first, they are rarely used in advanced programming, which prefers lists, maps, trees, etc.; second, I never had problems with pointers. From thomas.paulsen at firemail.de Fri May 22 06:33:38 2020 From: thomas.paulsen at firemail.de (Thomas Paulsen) Date: Thu, 21 May 2020 22:33:38 +0200 Subject: [TUHS] History of popularity of C In-Reply-To: <203511515.810845.1590088586459@connect.xfinity.com> References: <8a2e9b1b-8890-a783-5b53-c8480c070f2e@telegraphics.com.au> <202005211735.04LHZoUr006011@freefriends.org> <203511515.810845.1590088586459@connect.xfinity.com> Message-ID: >Microsoft C 7.0 already had a C++ compiler and an early version of MFC in >1992. But you're right: it was when Visual C++ 1.0 came out in 1993 that C++ became >really popular among developers targeting Windows. VC1.0 introduced "wizards" msc was really good in those days. As a systems guy I used to study its generated assembly code, which was extremely good. However, today's gcc uses advanced instructions too and is thus also very good, whereas all the Unix cc's of the 90s known to me were rather naive, simple lex&yacc-derived compilers. The "wizards" were also very good, making GUI programming much easier. From clemc at ccc.com Fri May 22 06:56:00 2020 From: clemc at ccc.com (Clem Cole) Date: Thu, 21 May 2020 16:56:00 -0400 Subject: [TUHS] History of popularity of C In-Reply-To: <8a2e9b1b-8890-a783-5b53-c8480c070f2e@telegraphics.com.au> References: <8a2e9b1b-8890-a783-5b53-c8480c070f2e@telegraphics.com.au> Message-ID: On Thu, May 21, 2020 at 12:17 PM Toby Thain wrote: > - inexpensive compiler availability was not very good until ~1990 or > later, Hrrumpt. The Gnu C compiler was starting to be available by the mid-1980s in alpha/beta form. rms was looking for places to start.
He approached a number of folks, from Tanenbaum to some of the vendors (he knew Masscomp had written a compiler from scratch, whose binaries we gave away to our customers, and he called me asking if we would donate it. We had donated development hardware and I was still his contact to the Gnu project at that point). As far as I know, he ended up writing his own because he could not find one to start with. The big kickstart for rms was that Sun had just started to charge for its compilers, and so a lot of people were looking for a free alternative (and frankly in those days the Sun compiler was still a bit of a toy -- the 20% we got over them at Masscomp was because we had a number of the folks from the DEC compiler team). It is true that the targets and the original systems it ran on were more limited. The 1.0 release was before the summer of '87 (in May maybe???). The biggest issue is that it did not run on DOS until the 386 and the DOS-extenders showed up. But it covered the many 68000 workstations and was often as good as or better than the supplied one [which were mostly based/derived from the MIT Jack Test port of the Johnson compiler for the NU system]. > but C had been taking off like wildfire for 10 years before that > At least 15 years before. By 1975, it was a solid fixture at most Universities. > - by the time gcc was mature (by some definition, but probably before > 1990) Mature is the key word here. gcc does not really start to mature until Cygnus takes it over. But it was quite usable for the systems that targeted it.
From tk at research.att.com Fri May 22 03:25:01 2020 From: tk at research.att.com (thomas kirk) Date: Thu, 21 May 2020 13:25:01 -0400 Subject: [TUHS] Chaos networking in 8th edition (and in 7th) In-Reply-To: References: <202005200329.04K3TmBF006901@tahoe.cs.Dartmouth.EDU> Message-ID: <24262.47469.805801.16547@alice.research.att.com> there were several symbolics machines nearby in 1125/6. no reason for any of them to be using chaosnet, but some probably did anyway. Rob Pike writes: > I have vague memories of one in 1135 or 1138. Could just be thinking of the > Symbolics poster by Bart's desk, though, the one advertising Emacs's "over > 400 easy to use commands". > > -rob > > > On Wed, May 20, 2020 at 1:30 PM Doug McIlroy wrote: > > > >> I don't recall any use of Chaos in 1127. Possibly one of > > >> the nearby groups who also used the Research system needed > > >> it at some point > > > > > Speculating wildly, maybe there was a Lisp machine somewhere? > > > > None that I can remember. > > > > Doug > > From toby at telegraphics.com.au Fri May 22 09:45:36 2020 From: toby at telegraphics.com.au (Toby Thain) Date: Thu, 21 May 2020 19:45:36 -0400 Subject: [TUHS] History of popularity of C In-Reply-To: References: <8a2e9b1b-8890-a783-5b53-c8480c070f2e@telegraphics.com.au> Message-ID: On 2020-05-21 4:56 PM, Clem Cole wrote: > > > On Thu, May 21, 2020 at 12:17 PM Toby Thain > wrote: > > - inexpensive compiler availability was not very good until ~1990 > or later, > > Hrrumpt The Gnu C compiler was starting to be available by the > mid-1980s in alpha/beta form. rms was looking for places to start. He Right, things were changing, but costly C compilers were a reality well into the 90s, unless your use case happened to coincide with a gcc port. And the reason this matters is that it contradicts the "C is popular because compilers were easy" assertion. Not "easy", and not necessarily cheap or free either.
> approached a number of folks, from Tanenbaum to some of the vendors (he > knew Masscomp had written a compiler from scratch, whose > binaries we gave away to our customers, and he called me asking if we would > donate it. We had donated development hardware and I was still his > contact to the Gnu project at that point). > > As far as I know, he ended up writing his own because he could not find > one to start with. ... > > > > but C had been taking off like wildfire for 10 years before that > > At least 15 years before. By 1975, it was a solid fixture at most > Universities. Yes. I should have said "more than 10" :-) --Toby > > > > - by the time gcc was mature (by some definition, but probably > before 1990) > > Mature is the key word here. gcc does not really start to mature until > Cygnus takes it over. But it was quite usable for the systems that > targeted it. From rich.salz at gmail.com Fri May 22 09:57:11 2020 From: rich.salz at gmail.com (Richard Salz) Date: Thu, 21 May 2020 19:57:11 -0400 Subject: [TUHS] History of popularity of C In-Reply-To: References: <8a2e9b1b-8890-a783-5b53-c8480c070f2e@telegraphics.com.au> Message-ID: Was the fact that gcc had the "portable" RTL as an intermediate representation important? That it was designed to be ported. And what about John Gilmore making all bsd use it? And the multiple usenix tutorials? From toby at telegraphics.com.au Fri May 22 10:17:46 2020 From: toby at telegraphics.com.au (Toby Thain) Date: Thu, 21 May 2020 20:17:46 -0400 Subject: [TUHS] History of popularity of C In-Reply-To: References: <8a2e9b1b-8890-a783-5b53-c8480c070f2e@telegraphics.com.au> Message-ID: <021e1fdd-e41a-9b79-0f34-cb07615c4eaa@telegraphics.com.au> On 2020-05-21 7:57 PM, Richard Salz wrote: > Was the fact that gcc had the "portable" RTL as an intermediate > representation important? That it was designed to be ported.
> > And what about John Gilmore making all bsd user it? And the multiple > usenix tutorials? Regardless of one's opinions on the ubiquity of gcc it wasn't mature and accessible until at least 10-15 years after C was already popular (depending how you count). And gcc is hardly an "easy" compiler project ... to the OP's question. --Toby From gnu at toad.com Fri May 22 14:10:06 2020 From: gnu at toad.com (John Gilmore) Date: Thu, 21 May 2020 21:10:06 -0700 Subject: [TUHS] History of popularity of C In-Reply-To: References: <8a2e9b1b-8890-a783-5b53-c8480c070f2e@telegraphics.com.au> Message-ID: <5106.1590120606@hop.toad.com> Richard Salz wrote: > And what about John Gilmore making all bsd user it? And the multiple usenix > tutorials? I think Rich is referring to the time in 1987-8 when I spent some time compiling the entire BSD distribution sources with the Vax version of gcc. This was a volunteer effort on my part so that Berkeley could adopt GCC to replace PCC. They got an ANSI C compiler, and avoided AT&T copyright restrictions on Yet Another critical piece of Berkeley Unix. GNU got an extensive test of GCC which moved it out of "beta" status. I ended up taking extensive notes, and wrote a 1988 paper about the experience, which I submitted to USENIX. But it was rejected, on the theory that porting code (even ancient crufty Unix code) through new compilers wasn't research. Indeed, I recall Kirk McKusick remarking to me around that time that even Unix kernel ports to new architectures were so routine as to not be research in his opinion. Oddly, I was easily able to find that paper (thanks to Kryder's Law), so I have appended it verbatim below (in troff with -ms macros). In short, I found about a dozen bugs in GCC, which RMS fixed; and many hundreds of bugs in the 4.3BSD Unix sources, which I fixed and Keith merged upstream. 
Note the quaint footnoted homage to distributed collaboration, which was still novel back then in the pre-Covid, pre-public-Internet, 2400 baud modem era. John .TL Porting Berkeley .UX through the GNU C Compiler .AU John Gilmore .AI Grasshopper Group San Francisco, CA, USA 94117 gnu at toad.com .AB We have ported UC Berkeley's latest .UX sources through the GNU C Compiler, a free draft-ANSI compatible compiler written by Richard Stallman and available from the Free Software Foundation. In the process, we made Berkeley .UX more compatible with the draft ANSI C standard, and tested the GNU C Compiler for its full production release. We describe the impact of various ANSI C changes on the Berkeley .UX sources, the kinds of non-portable code that the conversion uncovered, and how we fixed them. We also briefly explore some limitations in the tools used to build a .UX system. .AE .SH Introduction .PP The GNU C Compiler (GCC) is a complete C compiler, compatible with the draft ANSI standard, and available in source from the Free Software Foundation (FSF). It was written by Richard Stallman in 1986 and 1987, and is (at this writing) in its 18th release. It is a major component of the GNU (``GNU's Not .UX '') project, whose aim is to build a complete .UX -like software system, available in source to anyone who wants it. The compiler produces good code \(em better than most commercial compilers \(em and has been ported to the Vax, MC680X0, and NS32XXX. .PP Berkeley .UX , from the Computer Systems Research Group (CSRG) at the University of California at Berkeley, had its start in the 1970's with a prerelease .UX Version 7, and has been improving ever since. The current sources derive from the 1978 AT&T ``32V'' release, a V7 variant for the Vax. CSRG has produced four major releases for the Vax \(em 3, 4.1, 4.2, and 4.3BSD. 
These releases have set the standard for high powered .UX systems for many years, and continue to offer an improved alternative to the flat-tasting AT&T .UX releases. .PP However, Berkeley's C compiler is based on an old version of PCC, the Portable C Compiler from AT&T. There was little chance that anyone would provide ANSI C language extensions in this compiler, or do significant work on optimizing the generated code. By merging the GNU C compiler into the Berkeley release, we provided these new features to Berkeley Unix users at a low cost, while offering the GNU project an important test case for GNU C. .SH Goals .PP The major goal for the project is to move GCC out of ``beta test'' and into ``production'' status, by demonstrating that a successful .UX port can be based on it. .PP We are also providing a better maintained compiler for Berkeley .UX . GCC already produces better object code than the previous compiler, has a more modern internal structure, and supports useful features such as function prototype declarations. It is also maintained by a large collection of people around the world, who contribute their fixes and enhancements to the master sources. Regular releases by the Free Software Foundation encourage distribution of the improvements. In contrast, PCC is proprietary to AT&T, and few fixes are widely distributed, except as part of infrequent and expensive AT&T releases. .PP We are producing a .UX source tree which can be compiled by .I both the old and the new compilers. This is partly for convenience during the port, partly in case the project suffers long delays, and partly because Berkeley .UX also runs on the Tahoe, a fast Vax-like machine built by Computer Consoles, which GCC does not yet support. We are avoiding the introduction of new .B #ifdef 's, instead rewriting the code so that it does not depend on the features of either compiler. .PP We have to constantly remind ourselves to minimize the changes required.
It's too easy to get lost in a maze of twisty .UX code, all desperately needing improvement. .PP Whenever we have had to make a change, we have moved in the direction of ANSI C and POSIX compatibility. .SH People .PP The project was conceived by John Gilmore, and endorsed by Keith Bostic and Mike Karels of CSRG, and Richard Stallman of FSF. John did the major grunt work and provided fixes to the .UX code. Keith and Mike provided machine resources, collaborated on major decisions, and arbitrated the style and content of the changes to .UX . Richard provided quick turnaround on compiler bug fixes and problem solving. This setup worked extremely well. .PP We started work on 17 December 1987, and are not yet done at the time of writing (19 February 1988). About 9 days of my time, 2 of Keith's, half a day of Mike's, and XXX days of Richard's have gone into the project so far. .SH Working Style .PP Most of the work was done over networks, in a loosely coordinated style which was hard to conceive of only a few years ago.\(dg .FS \(dg Much of the free software work that is happening these days occurs in this manner, and I would like to publicly thank the original DARPA pioneers who gave birth to this vision of wide area, computer mediated collaborative work. .FE John worked in San Francisco, Keith in Berkeley, and Richard in Cambridge. Keith set up an account and a copy of the source tree on .I vangogh , a Vax 8600 at Berkeley. John spent a few days in front of a Sun at Berkeley getting things straight, but did most of the work by dialing in at 2400 baud from his office in San Francisco. When we modified .UX source files, Keith checked the changes and merged them back into the master .UX sources on another machine at Berkeley. When we found an apparent bug in GCC, we isolated a small excerpt or test program to demonstrate the bug, and forwarded it to Richard by Internet electronic mail. Bug fixes came back as new GCC releases, which were FTP'd over the Internet from MIT.
Ongoing status reports, discussions, and scheduling were done by \fIuucp\fP and Internet electronic mail. .PP At this writing, we have used four GCC releases (1.15 through 1.18). For each GCC release, we did a ``pass'' over the .UX source tree; one such pass included an updated source tree as well. Each GCC release was built, tested, and installed on .I vangogh without trouble. Then we ran .I "make clean; make" on the source tree, and examined 500K to 800K of resulting output. Keith Bostic's Makefiles did an excellent job of automating this process, though we ran into some problems with the .UX compilation model in general, and limitations in .I make in particular. .SH ANSI Language Changes .PP The problems encountered during the port fell into two general categories. Some of the code was not written portably and failed in the new environment. Other code was written portably for its time, but failed because ANSI C has redefined parts of the language. In some cases it was hard to tell the difference; the consensus on what is ``portable code'' changes over time, and on some points there is no agreement. .PP The major ANSI C problem was the generation of .B "character constants in cpp" . The traditional .UX C preprocessor (\fIcpp\fP), written by John F. Reiser, would substitute a macro's parameters into like-named substrings even inside single or double quotes in the macro definition. For example: .DS #define CTRL(c) ('c'&037) #define CEOF CTRL(d) .DE In an attempt to make things easier for tokenizing preprocessors, ANSI C has changed the rules here, and there is in fact .I no way to generate a character constant containing a macro argument. (There is a way to generate a character .I string , e.g. double-quoted string, but not a single-quoted character. We consider this a bug in ANSI C.) 
Fixing this required altering both the macro definition and each reference to the macro: .DS #define CTRL(c) (c&037) #define CEOF CTRL('d') .DE This required changes in about 10 system include files and in about 45 source modules. Many user programs turned out to depend on the undocumented .B CTRL macro, defined in .B , and since all its callers had to change, all those programs did too. .PP Another \fIcpp\fP problem involved .B "token concatenation" . No formal facilities were provided for this in the old \fIcpp\fP, but many users discovered that with code like this, from the /etc/passwd scanning code: .DS #define EXPAND(e) passwd.pw_/**/e = tp; while (*tp++ = *cp++); EXPAND(name); EXPAND(passwd); .DE they could cause a macro argument to be concatenated with another argument, or with preexisting text, to make a single name. In one case (\fIphantasia\fP), the Makefile provided half of a quoted string as a command line .B #define , and the source text provided the other half! ANSI C does not allow a preprocessor to concatenate tokens in these ways, instead providing a newly invented .B ## operator, and new rules requiring the compiler to concatenate adjacent character strings. Again, it was impossible to write a macro that works with both old and new compilers, and we didn't want to uglify our code with .B "#ifdef __STDC__" ; our solution was to rewrite both the macros and all their callers, to avoid ever having to concatenate tokens: .DS #define EXPAND(e) passwd.e = tp; while (*tp++ = *cp++); EXPAND(pw_name); EXPAND(pw_passwd); .DE Mostly the token concatenation was used as a typing convenience, so this was not a problem. It involved changes to five modules. We found no clean solution for .I phantasia ; a fix will probably involve rewriting it to do explicit string concatenations at runtime. .PP Changes to the .B "scope of externals" provided another set of widely scattered changes. 
If an external identifier is declared from inside a function, PCC causes that declaration to be visible to the entire remaining text of the source file. This also applies to functions which are implicitly declared when they first appear in an expression. This behaviour was not explicitly sanctioned by K&R, but it was condoned (pg. 206, 2nd paragraph), and many programs depended on it. ANSI C changed the scope rules to be more consistent; if you declare an external identifier in a local block, the declaration has no effect outside the block. We moved extern declarations to global scope, or added global function declarations, in 38 files to handle this. .PP A number of programs used .B "new keywords" such as \fIsigned\fP or \fIconst\fP as identifiers. We renamed the identifiers in 9 modules. .PP The Fortran libraries used a \fBtypedef name as a formal parameter\fP to a set of functions. ANSI C has disallowed this, since it complicates the parsing of the new prototype-style function declarations. We renamed the parameter in 8 modules. .PP Three modules used a \fBtypedef with modifiers\fP, e.g.: .DS typedef int CONSZ; x = (unsigned CONSZ) y; .DE This has been repudiated by ANSI C. We fixed it by making the original typedef \fBunsigned\fP where possible, or by creating a second typedef for ``U_CONSZ''. .SH Non-Portable Constructs .PP The worst non-portable construct we found in the .UX sources was the use of .B "pointers to non-members" . There was plenty of code as bad as: .DS int *foo; foo->memb = 5 if (foo->humbug >= -1) bah(); .DE and, in many cases, \fImemb\fP and \fIhumbug\fP are not even members of the same struct! Such code seems to have been written with a ``BCPL'' mentality, assuming that all pointers are really the same thing and it doesn't matter what their type is. Early C implementations lacked the .B union declarator, and did not distinguish between the members of different structures. 
Exploiting this has been considered bad practice for years, and lint checks for it, though many .UX compilers do not. We found a lot of it in old code, though newer code did not lack for examples either. Fixing this problem caused the most work, because we had to figure out what each untyped or mistyped pointer was .I really being used for, then fix its type, and whatever references to it were inconsistent with that type. We changed 5 modules due to this. One program, \fIefl\fP, would have required so much work that we abandoned it, since we could not find anyone using it. .PP Another problem was caused by existing uses of .B "cpp on non-C sources" . Various assembler language modules were being preprocessed by \fIcpp\fP, probably because there is no standard macro assembler for .UX . These modules are carefully arranged to avoid confusing the old \fIcpp\fP; for example, assembler language comments are introduced by .B # , but indented so that \fIcpp\fP will not treat them as control lines. ANSI \fIcpp\fP's handle white space on both sides of the ``#'', so indentation no longer hides these comments. Also, the ANSI rules require the preprocessor to keep track of which material is inside single and double quotes and which is outside; the old \fIcpp\fP terminated a character string or constant at the next unescaped newline. Vax assembler language uses unmatched quotes when specifying single ASCII characters, such as in immediate operands. This causes an ANSI \fIcpp\fP to stop processing # directives at that point, until it finds another unmatched quote. We chose to alter the assembler modules to avoid stumbling over these features in ANSI C preprocessors, without fixing the larger problem of using a C-specific preprocessor on non-C text. .PP In addition to embedded C preprocessor statements in assembler sources, we had to deal with .B "asm() constructs" in C source.
Some system-dependent routines were written in C with intermixed assembler code, producing a mess when compiled with anything but the original compiler. Other routines, such as .I compress , drop in an .B asm() here or there as an optimization. Still more modules, including the kernel, run a .I sed script over the assembler code generated by the C compiler, before assembling and linking it. There is no general solution to these problems. GCC has added an asm() facility that is independent of the compiler's register allocation strategy, but programs using this are incompatible with the old C compiler. We are investigating a possible fix involving changing all these places to use e.g. .B "#include " which, in GCC, would define inline code containing asm()s, while in PCC, declarations of (slower) external functions would be generated. .PP .I Troff used .B "multi-character constants" in its font tables; we fixed it with a macro for building an int out of two characters. A Fortran library module used the character constant .B 'EOF' , presumably a typo for .B EOF ; and \fIrogue\fP defined the character '\300' as a possible command letter. While ANSI C permits multiple character constants, they are implementation defined, and GCC wisely defines them to be invalid (as the standard should have done). .PP Some programs tried to declare functions or variables, .B "omitting both type and storage class" . This usage is not even valid in K&R, though PCC accepts it. We fixed this in about 15 modules, by adding ``int'' to the declarations. There were two other modules where this check uncovered inadvertent use of ``;'' in a declaration list where ``,'' was intended. .PP GCC provides better error checking in a few ways, and caught a number of bugs caused by misunderstood .B "sign extension" . 
It warns ``comparison is always 0 due to limited range of data type'' for constructs like: .DS char c; if (c == 0x80) foo(); .DE If a signed character contains the bit pattern 0x80, using it in an expression causes it to be sign-extended to 0xFFFFFF80, which does not equal 0x00000080. Bugs of this sort were fixed, typically by casting the 0x80 to (char), in 5 modules. .PP Changes to the rules for \fBparsing declarations\fP made us fix two modules where the last declaration in a struct was immediately followed by a closing brace, without a semicolon. Three more modules needed changes because the rules for where braces are required in struct or array initializers have changed. Four programs defined a \fBstruct foo\fP and then referenced it as a \fBunion foo\fP, or vice versa. Two programs declared \fBregister struct foo bar;\fP and then took bar's address, which is not allowed for register variables! .PP Thirteen programs had miscellaneous \fBpointer usage bugs\fP fixed. Two more were comparing pointers to \fB-1\fP; these were changed to use zero as a flag value instead. .PP In ANSI C, local variables in use at a .B setjmp() are no longer guaranteed to be preserved when a .B longjmp() occurs, unless they are declared \fBvolatile\fP. This is not a problem for the Vax port, since the Vax longjmp() will continue to restore the registers, but gcc warns about this situation, since code that assumes restoration is not portable. We have not yet worked on fixes for this. .PP Five or ten other miscellaneous bugs were caught and fixed. .SH Least portable .UX code .PP The process of porting software inevitably uncovers a few files that cause a disproportionate share of problems. For our port, the clear winner is .I efl , the Extended Fortran Language, by Stu Feldman. It defines ``\fBtypedef int * ptr;\fP'' in a header file, and then uses a ``ptr'' to point to anything.
GCC produced 1600 lines of error messages on this program alone, and three modules of it caused compiler core dumps. We ended up deciding to abandon support for it rather than attempt to clean it up. .PP A runner-up is .I pcc , the Portable C Compiler itself, by Steven C. Johnson. It caused GCC to coredump twice, tickled another GCC parsing bug, and contained the modified typedef and sign extension problems mentioned above. .PP Third place goes to .I monop , the Monopoly\(dg .FS \(dg Trademark of Parker Brothers .FE game, by Ken Arnold. This program used a variety of typed pointers, but the main pointer to a set of structs was declared as a \fBchar *\fP. Another part of the code initialized an array of struct pointers with integer values, then a small loop at the beginning of the game would read out these integers and replace them with corresponding ``real'' struct pointers. It took about two days to face up to the job and about a day to clean it up. .PP Honorable mention for silly mistakes goes to the .I indent program, by someone at the University of Illinois. It contained the only instance of .B "a + = b" (with a space between + and =), and was the only module to terminate its .B #include directives with a semicolon. It also contained a comparison between a character and the value 0200, a value that a signed 8-bit char can never hold. .SH Results .PP We are pleased with the results so far. Most of the .UX code compiled without problems, and the parts which we have executed are free from code generation bugs. The worst of the ANSI C changes only required roughly fifty modules to be changed, and there were only two problems of this magnitude. A total of twenty bugs in gcc have been located so far, and most of them are now fixed. We expected several times this many bugs; the compiler is in better shape than any of us expected. .PP Many minor type problems and ``nit'' incompatibilities with ANSI C have been removed from the .UX sources.
.SH Future Results .PP \fI(This section will move to \fBResults\fP for the final paper.)\fP .PP We expect that the size of the .UX binaries will be significantly less than with the previous compiler, but at the current stage of the project we can't easily confirm the expectation. .PP When the system compiled with GCC is in everyday use at Berkeley, GCC will be relabeled as a full production-quality compiler, which will encourage its wider use. .SH Non-Results .PP We have not attempted to make Berkeley .UX fully ANSI C compliant. In particular, we have retained preprocessor comments (#endif FOO) as well as machine-specific \fB#define\fP's (#ifdef vax). GCC supports these features without trouble, even though ANSI C does not. .PP The .UX kernel has not yet been ported to gcc. Other people are working on this, compiling one module at a time and running it for a while before moving on to the next. We will merge their work with ours once we have the rest of the system in a stable state. .PP Pieces of the Portable C Compiler are still being used inside .I "lint, f77" , and .I pc . Eventually someone will write Fortran and Pascal front-ends for gcc; this has already been done for C++. So far nobody has created a GNU \fIlint\fP, but it is an obvious project. .PP CSRG has ported Berkeley .UX to the Tahoe, a fast Vax-like machine built by Computer Consoles and resold by Harris and others. We are looking for someone to do a Tahoe port of gcc, to replace the PCC supplied by CCI. .SH Problems in Building .UX .PP .UX compilers traditionally look in certain global places in the file system for their libraries, include files, etc. This is a problem when cross-compiling, or when building a new .UX release (which almost amounts to the same thing). While it is possible to provide a new default directory for .B #include files, if a source program .B #include s a file that is not in the cross-compilation include files, the C compiler will erroneously use the one from /usr/include. 
There should be a switch that turns off \fIall\fP the built-in
include file and library pathnames, and only uses those specified
on the compiler's command line.
.PP
However, there is still the problem of getting those switches
to the compiler's command line.
.I Make
is a great tool for dealing with one directory's worth of files, but as
.UX
has evolved, \fImake\fP has not kept up. Indeed, it has fallen behind;
Makefiles that worked perfectly well five years ago will no longer work
because each manufacturer (AT&T especially) has hacked up their
.I make
to include harmful, gratuitous, and mutually incompatible changes.
The result is that a Makefile that works on your system is unlikely
to work on your neighbor's system, unless they are from the same
manufacturer, and you happen to use the same login shell.
.PP
.I Make
works poorly on nested directory structures, too. As an example,
we could find no way to change ``cc'' to ``gcc'' in all the Makefiles
used to build Berkeley
.UX
(short of text-editing them all). In a single directory, you can say
.I "make CC=gcc" ,
but this change is not propagated to subdirectories. You can
manually propagate that change one level by saying
.I "make CC=gcc MFLAGS='CC=gcc'"
but that only goes one level (at least in Berkeley's version of
.I make ).
We ended up putting a copy of gcc in a private
.I bin
directory, named
.I cc ,
and putting that directory on the front of the search path.
(When we later wanted to override CFLAGS as well,
\fI~/bin/cc\fP became a shell script that invokes
.I "gcc -W" ).
.PP
Another problem with
.I make
is that even if it was instructed to ignore errors (with -i or -k),
it exits if it can't locate a file that something else depends upon.
This has the effect of ``pruning'' a potentially large section of the
source hierarchy, and the only warning is an unobtrusive message
buried among 500K of other output.
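The private ``cc'' wrapper described earlier in this section might look roughly like the sketch below. It is written as a shell function so that it can be tried without changing the search path. The names cc_wrapper, REAL_CC and EXTRA_CFLAGS are assumptions made for illustration; only the default ``gcc -W'' invocation comes from the text.

```shell
# Sketch of the private "cc" wrapper as a shell function (hypothetical names).
# REAL_CC and EXTRA_CFLAGS are illustrative knobs, not part of the original
# setup; their defaults reproduce the "gcc -W" invocation mentioned in the text.
cc_wrapper() {
    # EXTRA_CFLAGS is deliberately unquoted so that several flags split into words.
    "${REAL_CC:-gcc}" ${EXTRA_CFLAGS:--W} "$@"
}
```

Installed as an executable script named cc in a directory at the front of the search path, the same command line (probably prefixed with exec) would be picked up by every Makefile that hard-codes ``cc'', at any directory depth, because the substitution happens in the shell's path lookup and not in a make variable.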
.PP
Of course, if someone were to fix these bugs in \fImake\fP, they would be
creating yet another incompatible version. I have been watching the
papers on the ``new makes'' and so far there doesn't seem to be one that
handles deeply nested source trees in a clean and consistent fashion, or
is otherwise so much better than \fImake\fP that it's worth the effort to switch.
I think it is time to look for a completely new paradigm for software
compilation control. I don't have any major insights on where to go
from here, but it is clear to me that
.I make
and its derivatives have reached their useful limits.
.SH
Availability
.PP
These changes will be available to recipients of Berkeley's next
software distribution, whenever that is. We will also make diffs
available to others involved in porting
.UX
to ANSI C.
We suspect that most of the problems we solved have already been
handled in one or another
.UX
port, but the work had to be duplicated because either it was not
sent back to Berkeley or AT&T, or the changes were not accepted.
(AT&T has a history of pretending that
.UX
bugs do not exist, and Berkeley has limited manpower).
.SH
Future Work
.PP
Future projects include building a complete set of ANSI C and POSIX
compatible include files and libraries (including function prototypes),
and converting the existing sources to use them.
An eventual goal is to produce a fully standard-conforming
.UX
system \(em not only in the interface provided to users, but with
sources which will compile and run on any standard-conforming
compiler and libraries.
.PP
The success of this collaboration between GNU and CSRG has encouraged
further cooperation. Both parties feel that AT&T licensing is a
problem; most recipients of CSRG releases have old
.UX
licenses, and are unwilling to upgrade to more expensive and more
onerous AT&T licenses.
However, new AT&T releases include some features which would be useful
in Berkeley
.UX .
The GNU project is working to provide early reimplementations of these features, such as improved shells and ``make'' commands. In return, CSRG is working to release software to the public which has previously been held to be `` .UX licensed'' even though it was not derived from AT&T code, such as the implementation of TCP/IP, and many of the Berkeley utility programs. .SH References .LP \fIDraft Proposed American National Standard \(em Programming Language C\fP, ANSI X3.J11, draft of October 1, 1986 (update for new draft when out). CBEMA, 311 First Street NW #1500, Washington DC 20001. .LP \fI4.3BSD Manual Set\fP, Computer Systems Research Group, University of California at Berkeley. .LP Fowler, Glenn S., ``The Fourth Generation Make'', Usenix conference proceedings, Summer 1985, page 159. (More references on ``make'' are provided in this paper.) .LP Hume, Andrew, ``Mk: a successor to make'', Usenix conference proceedings, Summer 1987, page 445. .LP Kernighan, Brian W. and Ritchie, Dennis M., ``\fIThe C Programming Language\fP'', Prentice-Hall, 1978. From imp at bsdimp.com Fri May 22 15:59:51 2020 From: imp at bsdimp.com (Warner Losh) Date: Thu, 21 May 2020 23:59:51 -0600 Subject: [TUHS] BBN technical reports? Message-ID: I went looking for the three IPC reports: For more information about this system, see: - "Interprocess Communication Extensions for the UNIX Operating System: I - Design Considerations", Rand Corporation, Report R-2064/1-AF, June 1977. - "Interprocess Communication Extensions for the UNIX Operating System: II - Implementation", Rand Corporation, Report R-2064/2-PR, April 1977. - "UNIX TCP User's Guide", Bolt Beranek and Newman Inc., Report No. 3724 And could only find the first one online at https://apps.dtic.mil/dtic/tr/fulltext/u2/a044200.pdf Do we have the other two anywhere? Warner -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From arnold at skeeve.com Fri May 22 17:42:03 2020 From: arnold at skeeve.com (arnold at skeeve.com) Date: Fri, 22 May 2020 01:42:03 -0600 Subject: [TUHS] History of popularity of C In-Reply-To: References: <8a2e9b1b-8890-a783-5b53-c8480c070f2e@telegraphics.com.au> Message-ID: <202005220742.04M7g3Fw006220@freefriends.org> Richard Salz wrote: > Was the fact that gcc had the "portable" RTL as an intermediate > representation important? That it was designed to be ported. I think it was. GCC had *two* intermediate forms, one representing the source program (trees), and the other representing instructions (RTL). It was really designed to make it easy to write both new front ends and new back ends. In that it seems to have succeeded fairly well, too. :-) Arnold From davida at pobox.com Fri May 22 18:28:52 2020 From: davida at pobox.com (David Arnold) Date: Fri, 22 May 2020 18:28:52 +1000 Subject: [TUHS] History of popularity of C In-Reply-To: <202005211735.04LHZoUr006011@freefriends.org> References: <202005211735.04LHZoUr006011@freefriends.org> Message-ID: <1A23A03A-4B86-487D-AA00-7AB2E8C5D500@pobox.com> On 22 May 2020, at 03:37, arnold at skeeve.com wrote: <...> > C++ became the language of choice on the PC when MSFT started pushing > its compiler and Visual Studio IDE. On the PC side, TurboPascal started to get displaced by Borland C++ I think in the early 90’s. I don’t have a good feeling why, but perhaps it was the parallel evolution of Microsoft’s C & C++, which were doing pretty well even before 1997 when Visual Studio began its rise. Watcom C++ was also around, iirc it was available for OS/2 as well? On the Unix side, the egcs fork of gcc pushed it forward a lot and the subsequent reverse takeover of gcc saved it from needing replacement far earlier. Of course the commercial Unix vendors charging for their compilers helped gcc too, and by then Pascal, Modula/2/3, Ada ... everything else had become a niche market. 
I don’t recall any hard data from back then though, sorry ...

d

From tih at hamartun.priv.no Fri May 22 18:52:56 2020
From: tih at hamartun.priv.no (Tom Ivar Helbekkmo)
Date: Fri, 22 May 2020 10:52:56 +0200
Subject: [TUHS] History of popularity of C
In-Reply-To: <20200521182817.08C0318C093@mercury.lcs.mit.edu> (Noel Chiappa's message of "Thu, 21 May 2020 14:28:17 -0400 (EDT)")
References: <20200521182817.08C0318C093@mercury.lcs.mit.edu>
Message-ID:

Noel Chiappa writes:

> I suspect the real reason for C's success was the nature of the language.
> When I first saw it (ca. 1976), it struck me as a quantum improvement over
> its contemporaries.

Paul Graham expressed it like this:

"It seems to me that there have been two really clean, consistent
models of programming so far: the C model and the Lisp model. These
two seem points of high ground, with swampy lowlands between them."

-tih
--
Most people who graduate with CS degrees don't understand the significance
of Lisp. Lisp is the most important idea in computer science. --Alan Kay

From coppero1237 at gmail.com Fri May 22 19:51:43 2020
From: coppero1237 at gmail.com (Tyler Adams)
Date: Fri, 22 May 2020 12:51:43 +0300
Subject: [TUHS] History of popularity of C
In-Reply-To:
References: <20200521182817.08C0318C093@mercury.lcs.mit.edu>
Message-ID:

Awesome, looks like my theory was completely wrong. Here's what it looks
like to me, please correct me as needed.

C's popularity has 2 distinct phases.

1972-1987 Unix drove C. Writing a functional PCC for a particular
architecture was easy, but not unusually so compared to other languages at
the time.

1987- gcc made C uniquely free to compile, so people chose to write C
because it was free and already popular. Perl also came out in 1987, and
afaik that was always free, but C still took off because there was so much
room for multiple languages.

So, now I'm curious about embedded systems.
In my limited experience, every "embedded system" I programmed for from 2002-2011 had C as its primary language. After 2011, I stopped programming embedded systems, so I don't know after that. Why was C so dominant in this space? Is it because adding a backend to gcc was free, C was already well known, and C was sufficiently performant? Tyler On Fri, May 22, 2020, 11:53 Tom Ivar Helbekkmo wrote: > Noel Chiappa writes: > > > I suspect the real reason for C's sucess was the nature of the language. > > When I first saw it (ca. 1976), it struck me as a quantum improvement > over > > its contemporaries. > > Paul Graham expressed it like this: > > "It seems to me that there have been two really clean, consistent > models of programming so far: the C model and the Lisp model. These > two seem points of high ground, with swampy lowlands between them." > > -tih > -- > Most people who graduate with CS degrees don't understand the significance > of Lisp. Lisp is the most important idea in computer science. --Alan Kay > -------------- next part -------------- An HTML attachment was scrubbed... URL: From arnold at skeeve.com Fri May 22 21:09:02 2020 From: arnold at skeeve.com (arnold at skeeve.com) Date: Fri, 22 May 2020 05:09:02 -0600 Subject: [TUHS] History of popularity of C In-Reply-To: References: <20200521182817.08C0318C093@mercury.lcs.mit.edu> Message-ID: <202005221109.04MB92D3016090@freefriends.org> Tyler Adams wrote: > So, now Im curious about embedded systems. In my limited experience, every > "embedded system" I programmed for from 2002-2011 had C as its primary > language. After 2011, I stopped programming embedded systems, so I don't > know after that. Why was C so dominant in this space? First of all, because C is the (almost) perfect language for embedded systems - tight code generated, language close to the metal, etc. etc. > Is it because adding > a backend to gcc was free, C was already well known, and C was sufficiently > performant? 
Cygnus Solutions (Hi John!) had a lot to do with this. They specialized in porting GCC to different processors used in embedded systems and provided support. Arnold From coppero1237 at gmail.com Fri May 22 21:15:18 2020 From: coppero1237 at gmail.com (Tyler Adams) Date: Fri, 22 May 2020 14:15:18 +0300 Subject: [TUHS] History of popularity of C In-Reply-To: <202005221109.04MB92D3016090@freefriends.org> References: <20200521182817.08C0318C093@mercury.lcs.mit.edu> <202005221109.04MB92D3016090@freefriends.org> Message-ID: Doesn't C++ also generate tight code and is fairly close to the metal? Today C++ is the high performant language for game developers and HFT shops. But, I never found it on any of these embedded systems, it was straight C. Tyler On Fri, May 22, 2020, 14:09 wrote: > Tyler Adams wrote: > > > So, now Im curious about embedded systems. In my limited experience, > every > > "embedded system" I programmed for from 2002-2011 had C as its primary > > language. After 2011, I stopped programming embedded systems, so I don't > > know after that. Why was C so dominant in this space? > > First of all, because C is the (almost) perfect language for embedded > systems - tight code generated, language close to the metal, etc. etc. > > > Is it because adding > > a backend to gcc was free, C was already well known, and C was > sufficiently > > performant? > > Cygnus Solutions (Hi John!) had a lot to do with this. They specialized > in porting GCC to different processors used in embedded systems and > provided support. > > Arnold > -------------- next part -------------- An HTML attachment was scrubbed... URL: From a.phillip.garcia at gmail.com Fri May 22 21:58:26 2020 From: a.phillip.garcia at gmail.com (A. P. 
Garcia) Date: Fri, 22 May 2020 07:58:26 -0400 Subject: [TUHS] History of popularity of C In-Reply-To: References: <20200521182817.08C0318C093@mercury.lcs.mit.edu> Message-ID: On Fri, May 22, 2020, 5:52 AM Tyler Adams wrote: > So, now Im curious about embedded systems. In my limited experience, every > "embedded system" I programmed for from 2002-2011 had C as its primary > language. After 2011, I stopped programming embedded systems, so I don't > know after that. Why was C so dominant in this space? Is it because adding > a backend to gcc was free, C was already well known, and C was sufficiently > performant? > I don't know how much gcc contributed to the success of C in the embedded space. Microcontrollers are often programmed in assembly. They have memory and speed constraints, much like the PDPs where C began. I think it goes back to what Larry said about C being so close to the metal. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lm at mcvoy.com Sat May 23 00:11:30 2020 From: lm at mcvoy.com (Larry McVoy) Date: Fri, 22 May 2020 07:11:30 -0700 Subject: [TUHS] History of popularity of C In-Reply-To: <5106.1590120606@hop.toad.com> References: <8a2e9b1b-8890-a783-5b53-c8480c070f2e@telegraphics.com.au> <5106.1590120606@hop.toad.com> Message-ID: <20200522141130.GF12554@mcvoy.com> On Thu, May 21, 2020 at 09:10:06PM -0700, John Gilmore wrote: > Note the quaint footnoted homage to distributed collaboration, which was > still novel back then in the pre-Covid, pre-public-Internet, 2400 baud > modem era. > > John http://mcvoy.com/lm/papers/porting-berkeley.pdf for those who don't want to run it through groff. As an aside, this didn't work (firefox couldn't display it): groff -ms -Tpdf porting-berkeley.ms > porting-berkeley.pdf but this did: groff -ms porting-berkeley.ms > PS ps2pdf PS porting-berkeley.pdf I'll ask the groff people if they know what is up. 
From lm at mcvoy.com Sat May 23 00:17:49 2020 From: lm at mcvoy.com (Larry McVoy) Date: Fri, 22 May 2020 07:17:49 -0700 Subject: [TUHS] History of popularity of C In-Reply-To: <5106.1590120606@hop.toad.com> References: <8a2e9b1b-8890-a783-5b53-c8480c070f2e@telegraphics.com.au> <5106.1590120606@hop.toad.com> Message-ID: <20200522141749.GH12554@mcvoy.com> Clem, you should read that paper, link again: http://mcvoy.com/lm/papers/porting-berkeley.pdf because it validates a lot of what I have said about not having access to the AT&T code. The BSD code was slightly easier to get but even that, around 1985 at UW-Madison, was locked up on an 11/750 named slovax. I had to beg and beg to get a login on that machine. You had to be somebody to get access to the source and I was still nobody. I did get a login eventually, I think I had to sign some papers, don't remember. I went on to spend so many happy hours reading the sources that my primary machine, be it 68k, SPARC, MIPS, x86, whatever, has been called slovax ever since. On Thu, May 21, 2020 at 09:10:06PM -0700, John Gilmore wrote: > Richard Salz wrote: > > And what about John Gilmore making all bsd user it? And the multiple usenix > > tutorials? > > I think Rich is referring to the time in 1987-8 when I spent some time > compiling the entire BSD distribution sources with the Vax version of > gcc. This was a volunteer effort on my part so that Berkeley could > adopt GCC to replace PCC. They got an ANSI C compiler, and avoided AT&T > copyright restrictions on Yet Another critical piece of Berkeley Unix. > GNU got an extensive test of GCC which moved it out of "beta" status. > > I ended up taking extensive notes, and wrote a 1988 paper about the > experience, which I submitted to USENIX. But it was rejected, on the > theory that porting code (even ancient crufty Unix code) through new > compilers wasn't research. 
Indeed, I recall Kirk McKusick remarking to > me around that time that even Unix kernel ports to new architectures > were so routine as to not be research in his opinion. > > Oddly, I was easily able to find that paper (thanks to Kryder's Law), so > I have appended it verbatim below (in troff with -ms macros). In short, > I found about a dozen bugs in GCC, which RMS fixed; and many hundreds of > bugs in the 4.3BSD Unix sources, which I fixed and Keith merged upstream. > > Note the quaint footnoted homage to distributed collaboration, which was > still novel back then in the pre-Covid, pre-public-Internet, 2400 baud > modem era. > > John > > .TL > Porting Berkeley > .UX > through the GNU C Compiler > .AU > John Gilmore > .AI > Grasshopper Group > San Francisco, CA, USA 94117 > gnu at toad.com > .AB > We have ported UC Berkeley's latest > .UX > sources through the GNU C Compiler, > a free draft-ANSI compatible compiler written by Richard Stallman and available from the Free > Software Foundation. In the process, we made Berkeley > .UX > more compatible > with the draft ANSI C standard, and tested the GNU C Compiler > for its full production release. > We describe the impact of various ANSI C changes on the Berkeley > .UX > sources, the kinds of non-portable code that the conversion uncovered, > and how we fixed them. We also briefly explore some limitations in the tools > used to build a > .UX > system. > .AE > .SH > Introduction > .PP > The GNU C Compiler (GCC) is a complete C compiler, compatible with the draft > ANSI standard, and > available in source from the Free Software Foundation (FSF). It was written by > Richard Stallman > in 1986 and 1987, and is (at this writing) in its > 18th release. It is a major component of the GNU (``GNU's Not > .UX '') > project, whose aim > is to build a complete > .UX -like > software system, > available in source to anyone who wants it. 
> The compiler produces good code \(em better than most commercial
> compilers \(em and has been ported to the Vax, MC680X0,
> and NS32XXX.
> .PP
> Berkeley
> .UX ,
> from the Computer Systems Research Group (CSRG) at the University
> of California at Berkeley,
> had its start in the 1970's with a prerelease
> .UX
> Version 7, and
> has been improving ever since. The current sources derive from the
> 1978 AT&T ``32V'' release, a V7 variant for the Vax. CSRG has produced
> four major releases for the Vax
> \(em 3, 4.1, 4.2, and 4.3BSD. These releases have set the
> standard for high powered
> .UX
> systems for many years, and continue
> to offer an improved alternative to the flat-tasting AT&T
> .UX
> releases.
> .PP
> However, Berkeley's C compiler is based on an old version of PCC,
> the Portable C Compiler from AT&T. There was little chance that anyone
> would provide ANSI C language extensions in this compiler, or do significant
> work on optimizing the generated code. By merging the GNU C compiler
> into the Berkeley release, we provided these new features to Berkeley
> Unix users at a low cost,
> while offering the GNU project an important test case for GNU C.
> .SH
> Goals
> .PP
> The major goal for the project is to move GCC out of ``beta test'' and
> into ``production'' status,
> by demonstrating that a successful
> .UX
> port can be based on it.
> .PP
> We are also providing a better maintained
> compiler for Berkeley
> .UX .
> GCC already produces better
> object code than the previous compiler,
> has a more modern internal structure, and supports useful features
> such as function prototype declarations.
> It is also maintained by a large collection of people around the world,
> who contribute their fixes and enhancements to the master sources.
> Regular releases by the
> Free Software Foundation encourage distribution of the improvements.
> In contrast, PCC
> is proprietary to AT&T, and few fixes are widely distributed, except as
> part of infrequent and expensive AT&T releases.
> .PP
> We are producing a
> .UX
> source tree which can be compiled
> by
> .I both
> the old and the new compilers. This is partly for convenience during the port,
> partly in case the project suffers long delays,
> and partly because Berkeley
> .UX
> also runs on the Tahoe, a fast Vax-like machine
> built by Computer Consoles, which
> GCC does not yet support.
> We are avoiding the introduction of new
> .B #ifdef 's,
> instead rewriting the code so that it does not depend
> on the features of either compiler.
> .PP
> We have to constantly remind ourselves to minimize the changes required.
> It's too easy to get lost in a maze of twisty
> .UX
> code, all desperately
> needing improvement.
> .PP
> Whenever we have to make a change, we have moved in the direction of
> ANSI
> C and POSIX compatibility.
> .SH
> People
> .PP
> The project was conceived by John Gilmore, and endorsed
> by Keith Bostic and Mike Karels of CSRG, and Richard Stallman of FSF.
> John did the major grunt work and provided fixes to the
> .UX
> code.
> Keith and Mike provided machine
> resources, collaborated
> on major decisions, and arbitrated the style and content of the changes
> to
> .UX .
> Richard provided quick turnaround on compiler bug fixes and problem
> solving.
> This setup worked extremely well.
> .PP
> We started work on 17 December 1987, and are not yet done at the
> time of writing (19 February 1988). About 9 days of my time, 2 of Keith's,
> half a day of Mike's, and XXX days of Richard's have gone into the
> project so far.
> .SH
> Working Style
> .PP
> Most of the work was done over networks, in a loosely coordinated
> style which was hard to conceive of only a few years ago.\(dg
> .FS \(dg
> Much of the free software work that is happening these days occurs in this
> manner, and I would like to publicly thank the original DARPA pioneers who gave
> birth to this vision of wide area, computer mediated collaborative work.
> .FE
> John worked in San Francisco,
> Keith in Berkeley, and Richard in Cambridge. Keith set up an account and
> a copy of the source tree on
> .I vangogh ,
> a Vax 8600 at Berkeley.
> John spent a few
> days in front of a Sun at Berkeley getting things straight, but did
> most of the work by dialing in at 2400 baud from his office in San Francisco.
> When we modified
> .UX
> source files, Keith
> checked the changes and merged them back into the master
> .UX
> sources on another machine at Berkeley. When we found an apparent
> bug in GCC, we isolated a small
> excerpt or test program to demonstrate the bug, and forwarded it to Richard by Internet electronic
> mail.
> Bug fixes came back as new GCC releases, which were FTP'd over the Internet
> from MIT. Ongoing status reports, discussions, and scheduling were done
> by \fIuucp\fP and Internet electronic mail.
> .PP
> At this writing, we have used four GCC releases (1.15 through 1.18).
> For each
> GCC release, we did a ``pass'' over the
> .UX
> source tree;
> one such pass included an updated source tree as well.
> Each GCC
> release was built, tested, and installed on
> .I vangogh
> without trouble.
> Then we ran
> .I "make clean; make"
> on the source tree, and examined 500K to 800K of resulting
> output. Keith Bostic's Makefiles did an excellent job of
> automating this process, though we ran into some problems with the
> .UX
> compilation model in general, and limitations in
> .I make
> in particular.
> .SH > ANSI Language Changes > .PP > The problems encountered during the port fell into two general categories. > Some of the code was not written portably and failed in the new environment. > Other code was written portably for its time, but failed because ANSI C > has redefined parts of the language. In some cases it was hard to tell > the difference; the consensus on what is ``portable code'' changes over > time, and on some points there is no agreement. > .PP > The major ANSI C problem was the generation of > .B "character constants in cpp" . > The traditional > .UX > C preprocessor (\fIcpp\fP), written by John F. Reiser, would > substitute a macro's parameters into like-named substrings even inside > single or double quotes in the macro definition. For example: > .DS > #define CTRL(c) ('c'&037) > #define CEOF CTRL(d) > .DE > In an attempt to make things easier for tokenizing preprocessors, > ANSI C has changed the > rules here, and there is in fact > .I no > way to generate a character constant containing a macro argument. > (There is a way to generate a character > .I string , > e.g. double-quoted string, but not a single-quoted character. > We consider this a bug in ANSI C.) > Fixing this required altering both the macro definition and each reference > to the macro: > .DS > #define CTRL(c) (c&037) > #define CEOF CTRL('d') > .DE > This required changes in about 10 system include files and in about 45 > source modules. Many user programs turned out to depend on the undocumented > .B CTRL > macro, defined in > .B , > and since all its callers had to change, all those programs did too. > .PP > Another \fIcpp\fP problem involved > .B "token concatenation" . 
> No formal facilities were provided for this in the old \fIcpp\fP, but many > users discovered that with code like this, from the /etc/passwd scanning code: > .DS > #define EXPAND(e) passwd.pw_/**/e = tp; while (*tp++ = *cp++); > EXPAND(name); > EXPAND(passwd); > .DE > they could cause a macro argument to be concatenated with another argument, > or with preexisting text, to make a single name. In one case > (\fIphantasia\fP), > the Makefile provided half of a quoted string as a command line > .B #define , > and the source text provided the other half! > ANSI C > does not allow a preprocessor to concatenate tokens in these ways, instead > providing a newly invented > .B ## > operator, and new rules requiring the compiler to concatenate adjacent > character strings. Again, > it was impossible to write > a macro that works with both old and new compilers, and we didn't want > to uglify our code with > .B "#ifdef __STDC__" ; > our solution was to > rewrite both the macros and all their callers, to avoid ever having to > concatenate tokens: > .DS > #define EXPAND(e) passwd.e = tp; while (*tp++ = *cp++); > EXPAND(pw_name); > EXPAND(pw_passwd); > .DE > Mostly the token concatenation was used as a typing convenience, so this > was not a problem. It involved changes to five modules. > We found no clean solution for > .I phantasia ; > a fix will probably involve rewriting it to do explicit > string concatenations at runtime. > .PP > Changes to the > .B "scope of externals" > provided another set of widely scattered changes. If an external > identifier is declared from inside a function, PCC causes that declaration > to be visible to the entire remaining text of the source file. > This also applies to functions which are implicitly declared > when they first appear in an expression. > This > behaviour was not explicitly sanctioned by K&R, > but it was condoned (pg. 206, 2nd paragraph), and many programs depended on it. 
> ANSI C changed the scope rules to be more consistent; if you declare an > external identifier in a local block, the declaration has no effect outside > the block. We moved extern declarations to global scope, or added global > function declarations, in 38 files to handle this. > .PP > A number of programs used > .B "new keywords" > such as \fIsigned\fP or \fIconst\fP as identifiers. We renamed the identifiers > in 9 modules. > .PP > The Fortran libraries used a \fBtypedef name as a formal parameter\fP > to a set of functions. ANSI C has disallowed this, since it complicates > the parsing of the new prototype-style function declarations. We renamed > the parameter in 8 modules. > .PP > Three modules used a \fBtypedef with modifiers\fP, e.g.: > .DS > typedef int CONSZ; > x = (unsigned CONSZ) y; > .DE > This has been repudiated by ANSI C. We fixed it by making the original > typedef \fBunsigned\fP where possible, or by > creating a second typedef for ``U_CONSZ''. > .SH > Non-Portable Constructs > .PP > The worst non-portable construct we found in the > .UX > sources was the use of > .B "pointers to non-members" . > There was plenty of code as bad as: > .DS > int *foo; > foo->memb = 5 > if (foo->humbug >= -1) bah(); > .DE > and, in many cases, \fImemb\fP and \fIhumbug\fP are not even members of > the same struct! > Such code seems to have been written with a ``BCPL'' mentality, assuming > that all pointers are really the same thing and it doesn't matter what their > type is. Early C implementations lacked the > .B union > declarator, > and did not distinguish between the members of different structures. > Exploiting this has been considered > bad practice for years, and lint checks for it, > though many > .UX > compilers do not. We found a lot of it in old code, though newer > code did not lack for examples either. 
> Fixing this problem caused the most work,
> because we had to figure out what each untyped or mistyped pointer was
> .I really
> being used for, then fix its type, and whatever references to it were
> inconsistent with that type. We changed 5 modules due to this.
> One program, \fIefl\fP, would have required so much work
> that we abandoned it, since we could
> not find anyone using it.
> .PP
> Another problem was caused by existing uses of
> .B "cpp on non-C sources" .
> Various assembler language modules were being preprocessed by \fIcpp\fP,
> probably
> because there is no standard macro assembler for
> .UX .
> These modules are
> carefully arranged to avoid confusing the old \fIcpp\fP; for example,
> assembler language comments are introduced by
> .B # ,
> but indented so that \fIcpp\fP will not treat them as control lines.
> ANSI \fIcpp\fP's handle white space on both sides of the ``#'', so
> indentation no longer hides these comments. Also, the ANSI rules
> require the preprocessor to keep track of which
> material is inside single and double quotes and which is outside;
> the old \fIcpp\fP terminated a character string or constant at the next
> unescaped newline. Vax assembler language uses unmatched quotes
> when specifying single ASCII characters, such as in immediate operands.
> This causes an ANSI \fIcpp\fP to stop processing # directives at that point,
> until it finds another
> unmatched quote. We chose to alter the assembler modules to avoid
> stumbling over these features in ANSI C preprocessors, without fixing the
> larger problem of using a C-specific preprocessor on non-C text.
> .PP
> In addition to embedded C preprocessor statements in assembler
> sources, we had to deal with
> .B "asm() constructs"
> in C source. Some system-dependent routines were written in C
> with intermixed assembler code, producing a mess when compiled with
> anything but the original compiler.
Other routines, such as > .I compress , > drop in an > .B asm() > here or there as an optimization. Still more modules, including the kernel, > run a > .I sed > script over the assembler code generated by the C compiler, before > assembling and linking it. There is no general solution to these > problems. GCC has added an asm() facility that is independent of > the compiler's register allocation strategy, but programs using this are > incompatible with the old C compiler. > We are investigating > a possible fix involving > changing all these places to use e.g. > .B "#include " > which, in GCC, would define inline code containing asm()s, while > in PCC, declarations of (slower) external functions would be generated. > .PP > .I Troff > used > .B "multi-character constants" > in its font tables; we fixed it with a macro for building an int out of two > characters. A Fortran library module used the character constant > .B 'EOF' , > presumably a typo for > .B EOF ; > and \fIrogue\fP defined the character '\300' as a possible command letter. > While ANSI C permits multiple character constants, they are implementation > defined, and GCC wisely defines them to be invalid (as the standard should > have done). > .PP > Some programs tried to declare functions or variables, > .B "omitting both type and storage class" . > This usage is not even valid in K&R, though PCC accepts it. We fixed this in > about 15 > modules, by adding ``int'' to the declarations. There were two other modules > where this check uncovered inadvertent use of ``;'' in a declaration list > where ``,'' was intended. > .PP > GCC provides better error checking in a few ways, and caught a number > of bugs caused by misunderstood > .B "sign extension" . 
> It warns ``comparison is always 0 due to limited range of data type'' > for constructs like: > .DS > char c; > if (c == 0x80) foo(); > .DE > If a signed character contains the bit pattern 0x80, using it in an > expression causes it to be > sign-extended to 0xFFFFFF80, which does not equal 0x00000080. > Bugs of this sort were fixed, typically by casting the 0x80 to (char), > in 5 modules. > .PP > Changes to the rules for \fBparsing declarations\fP made us fix two modules > where the last declaration in a struct was immediately followed by a > closing brace, without a semicolon. Three more modules needed changes > because the rules for where braces are required in struct or array > initializers have changed. Four programs defined a \fBstruct foo\fP > and then referenced it as a \fBunion foo\fP, or vice versa. Two programs > declared \fBregister struct foo bar;\fP and then took bar's address, which > is not allowed for register variables! > .PP > Thirteen programs had miscellaneous \fBpointer usage bugs\fP > fixed. Two more were > comparing pointers to \fB-1\fP; these were changed to use zero as a > flag value instead. > .PP > In ANSI C, local variables in use at a > .B setjmp() > are no longer guaranteed to be preserved when a > .B longjmp() > occurs, unless they are declared \fBvolatile\fP. This > is not a problem for the Vax port, since the Vax longjmp() > will continue to restore the registers, but gcc warns about this > situation, since code that assumes restoration is not portable. > We have not yet worked on fixes for this. > .PP > Five or ten other miscellaneous bugs were caught and fixed. > .SH > Least portable > .UX > code > .PP > The process of porting software inevitably uncovers > a few files that cause a disproportionate share of problems. > For our port, > the clear winner is > .I efl , > the Extended Fortran Language, by Stu Feldman. > It defines ``\fBtypedef int * ptr;\fP'' in a header file, > and then uses a ``ptr'' to point to anything.
> GCC produced > 1600 lines of error messages on this program alone, and three modules > of it caused compiler core dumps. We ended > up deciding to abandon support for it rather than attempt to clean > it up. > .PP > A runner-up is > .I pcc , > the Portable C Compiler itself, by Steven C. Johnson. > It caused GCC to dump core twice, tickled another GCC parsing bug, > and contained the modified typedef and sign extension problems mentioned above. > .PP > Third place goes to > .I monop , > the Monopoly\(dg > .FS \(dg > Trademark of Parker Brothers > .FE > game, by Ken Arnold. This > program used a variety of typed pointers, but the main pointer to > a set of structs was declared as a \fBchar *\fP. Another part of > the code initialized an array of struct pointers with integer values, > then a small loop at the beginning of the game would read out these > integers and replace them with corresponding ``real'' struct pointers. > It took about two days to face up to the job and about a day to clean > it up. > .PP > Honorable mention for silly mistakes goes to the > .I indent > program, by someone at the University of Illinois. > It contained the only instance of > .B "a + = b" > (with a space between + and =), and was the only module > to terminate its > .B #include > directives with a semicolon. > It also contained a comparison between a character and the value 0200, > a value that a signed 8-bit char can never hold. > .SH > Results > .PP > We are pleased with the results so far. Most of the > .UX > code compiled > without problems, and the parts which we have executed are free from > code generation bugs. > The worst of the ANSI C changes only required roughly fifty modules > to be changed, and there were only two problems of this magnitude. > A total of > twenty bugs in gcc have been located so far, and most of them are now fixed. > We expected several times this many bugs; the compiler is in better > shape than any of us expected.
> .PP > Many minor type problems and ``nit'' incompatibilities with ANSI C have > been removed from the > .UX > sources. > .SH > Future Results > .PP > \fI(This section will move to \fBResults\fP for the final paper.)\fP > .PP > We expect that the size of the > .UX > binaries will be significantly less than > with the previous compiler, but at the current stage of the project > we can't easily confirm the expectation. > .PP > When the system compiled with GCC is in everyday use at Berkeley, GCC > will be relabeled as a full production-quality compiler, which will > encourage its wider use. > .SH > Non-Results > .PP > We have not attempted to make Berkeley > .UX > fully ANSI C compliant. > In particular, we have retained preprocessor comments (#endif FOO) > as well as machine-specific \fB#define\fP's (#ifdef vax). GCC supports > these features without trouble, even though ANSI C does not. > .PP > The > .UX > kernel has not yet been ported to gcc. Other people are working on > this, compiling one module at a time and running it for a while before > moving on to the next. We will merge their work with > ours once we have the rest of the system in a stable state. > .PP > Pieces of the Portable C Compiler are still being used inside > .I "lint, f77" , > and > .I pc . > Eventually someone will write Fortran and Pascal front-ends for gcc; > this has already been done for C++. So far nobody has created a GNU > \fIlint\fP, but it is an obvious project. > .PP > CSRG has ported Berkeley > .UX > to the Tahoe, a fast Vax-like machine > built by Computer Consoles and resold by Harris and others. We are looking > for someone to do a Tahoe port of gcc, to replace the PCC supplied by CCI. > .SH > Problems in Building > .UX > .PP > .UX > compilers traditionally look in certain global places in the > file system for their libraries, include files, etc. This is a problem > when cross-compiling, or when building a new > .UX > release (which almost > amounts to the same thing).
While it is possible to provide a new > default directory for > .B #include > files, if a source program > .B #include s > a file that is not in the cross-compilation include files, > the C compiler will erroneously use the one from /usr/include. > There should be a switch that turns off \fIall\fP the built-in include > file and library pathnames, and only uses those specified on the > compiler's command line. > .PP > However, there is still the problem of getting those switches to the > compiler's command line. > .I Make > is a great tool for dealing with one directory's worth of files, > but as > .UX > has evolved, \fImake\fP has not kept up. Indeed, it has fallen behind; > Makefiles that worked perfectly well five years ago will no longer > work because each manufacturer (AT&T especially) has hacked up their > .I make > to include harmful, gratuitous, and mutually incompatible changes. > The result is that a Makefile that works on your system is unlikely > to work on your neighbor's system, unless they are from the same manufacturer, > and you happen to use the same login shell. > .PP > .I Make > works poorly on nested directory structures, too. > As an example, we could find no way to change ``cc'' to ``gcc'' in all the > Makefiles used to build Berkeley > .UX > (short of text-editing them all). > In a single directory, you can say > .I "make CC=gcc" , > but this change is not propagated to subdirectories. You can manually > propagate that change one level by saying > .I "make CC=gcc MFLAGS='CC=gcc'" > but that only goes one level (at least in Berkeley's version of > .I make ). > We ended up putting a copy of gcc in a private > .I bin > directory, named > .I cc , > and putting that directory on the front of the search path. > (When we later wanted to override CFLAGS as well, \fI~/bin/cc\fP > became a shell script that invokes > .I "gcc -W" ). 
> .PP > Another problem with > .I make > is that even if it was instructed to ignore errors (with -i or -k), it exits > if it can't locate a file that something else depends upon. This has the > effect of ``pruning'' a potentially large section > of the source hierarchy, and the > only warning is an unobtrusive > message buried among 500K of other output. > .PP > Of course, if someone was to fix these bugs in \fImake\fP, they would > be creating yet another incompatible version. > I have been watching the papers on the ``new makes'' and so far there > doesn't seem to be one that handles deeply nested > source trees in a clean and consistent fashion, or is otherwise > so much better than \fImake\fP that it's worth the effort to switch. > I think it is time to look for a completely new paradigm for > software compilation control. I don't have any major insights on where > to go from here, but it is clear to me that > .I make > and its derivatives have reached their useful limits. > .SH > Availability > .PP > These changes will be available to recipients of Berkeley's next software > distribution, whenever that is. We will also make diffs available > to others involved in porting > .UX > to ANSI C. We suspect that most of the > problems we solved have already been handled in one or another > .UX > port, but the work had to be duplicated because either it was not > sent back to Berkeley or AT&T, or the changes were not accepted. (AT&T > has a history of pretending that > .UX > bugs do not exist, and > Berkeley has limited manpower). > .SH > Future Work > .PP > Future projects include building a complete set of ANSI C and POSIX > compatible include files and libraries (including function prototypes), > and converting the existing sources to use them. 
An eventual goal > is to produce a fully standard-conforming > .UX > system \(em not only in > the interface provided to users, but with sources which will compile > and run on any standard-conforming compiler and libraries. > .PP > The success of this collaboration between GNU and CSRG has encouraged further > cooperation. Both parties feel that AT&T licensing > is a problem; most recipients of CSRG releases have old > .UX > licenses, > and are unwilling to upgrade to more expensive and more onerous AT&T > licenses. However, new AT&T releases include some features which would > be useful in Berkeley > .UX . > The GNU project is working to provide > early reimplementations of these features, such as improved shells and > ``make'' commands. In return, CSRG is working to release software to > the public which has previously been held to be `` > .UX > licensed'' even though > it was not derived from AT&T code, such as the implementation > of TCP/IP, and many of the Berkeley utility programs. > .SH > References > .LP > \fIDraft Proposed American National Standard \(em Programming Language C\fP, > ANSI X3.J11, draft of October 1, 1986 (update for new draft when out). > CBEMA, 311 First Street NW #1500, Washington DC 20001. > .LP > \fI4.3BSD Manual Set\fP, > Computer Systems Research Group, University of California > at Berkeley. > .LP > Fowler, Glenn S., ``The Fourth Generation Make'', Usenix conference > proceedings, Summer 1985, page 159. (More references on ``make'' > are provided in this paper.) > .LP > Hume, Andrew, ``Mk: a successor to make'', Usenix conference > proceedings, Summer 1987, page 445. > .LP > Kernighan, Brian W. and Ritchie, Dennis M., ``\fIThe > C Programming Language\fP'', Prentice-Hall, 1978. 
-- --- Larry McVoy lm at mcvoy.com http://www.mcvoy.com/lm From rich.salz at gmail.com Sat May 23 00:34:30 2020 From: rich.salz at gmail.com (Richard Salz) Date: Fri, 22 May 2020 10:34:30 -0400 Subject: [TUHS] History of popularity of C In-Reply-To: <20200522141130.GF12554@mcvoy.com> References: <8a2e9b1b-8890-a783-5b53-c8480c070f2e@telegraphics.com.au> <5106.1590120606@hop.toad.com> <20200522141130.GF12554@mcvoy.com> Message-ID: Great to hear from you John. I remember you handing out flyers during various Usenix meetings about this. :) One of my favorite parts of your paper: "the flat-tasting AT&T releases" ! -------------- next part -------------- An HTML attachment was scrubbed... URL: From pnr at planet.nl Sat May 23 00:43:02 2020 From: pnr at planet.nl (Paul Ruizendaal) Date: Fri, 22 May 2020 16:43:02 +0200 Subject: [TUHS] BBN technical reports? Message-ID: <47C6BA0D-48B8-43F9-8805-614B25F98B14@planet.nl> > I went looking for the three IPC reports: > > For more information about this system, see: > - "Interprocess Communication Extensions for the UNIX Operating System: I - > Design Considerations", Rand Corporation, Report R-2064/1-AF, June 1977. > - "Interprocess Communication Extensions for the UNIX Operating System: II > - Implementation", Rand Corporation, Report R-2064/2-PR, April 1977. > - "UNIX TCP User's Guide", Bolt Beranek and Newman Inc., Report No. 3724 > > > And could only find the first one online at > > https://apps.dtic.mil/dtic/tr/fulltext/u2/a044200.pdf > > Do we have the other two anywhere? Yes we do. The first two are about Rand ports. You’ve found the overview by Carl Sunshine, the implementation by Steve Zucker is the next report up on DTIC: https://apps.dtic.mil/sti/pdfs/ADA044201.pdf The TCP User guide report you refer to is about a TCP implementation done by Jack Haverty, which took an implementation in PDP-11 assembler and wrapped it in a NCP Unix shell. Jack has a listing in his attic, but it is not scanned. 
The problem with this implementation was that it was all very cramped in the kernel, so he had to reduce disk buffers substantially. As pipes (and Rand ports) use disk buffers to hold pipe data, which get swapped out to disk as needed, he found that his system thrashed a lot and effective TCP speeds in his first trials were barely above a few dozen bytes per second. The report is actually on the TUHS tree: https://www.tuhs.org/cgi-bin/utree.pl?file=BBN-V6/doc (there is also nroff source for Steven Zucker’s report, and other interesting material in that directory - such as a report on pipe performance). The actual V6 implementation that is on that TUHS page is not Jack Haverty’s version, but a version done from scratch by Mike Wingfield, using the learnings of Jack’s work. Mike’s implementation worked quite well and won first prize in a March 1979 tongue-in-cheek TCP4 conformance & interop competition. The BBN report for that version is 4295. Craig Partridge was kind enough to dig it up from the BBN library - I can send you a copy if you like (it was cleared for public release). Although the implementation was done for research purposes (mainly Autodin II security features and simulation), it seems to have had a long life as a stop-gap to keep aging PDP-11s connected after the switch from NCP to TCP in 1983. Paul From toby at telegraphics.com.au Sat May 23 00:59:27 2020 From: toby at telegraphics.com.au (Toby Thain) Date: Fri, 22 May 2020 10:59:27 -0400 Subject: [TUHS] History of popularity of C In-Reply-To: <202005221109.04MB92D3016090@freefriends.org> References: <20200521182817.08C0318C093@mercury.lcs.mit.edu> <202005221109.04MB92D3016090@freefriends.org> Message-ID: <09ac14d8-3c83-1755-5326-01befa8b7a48@telegraphics.com.au> On 2020-05-22 7:09 AM, arnold at skeeve.com wrote: > Tyler Adams wrote: > >> So, now I'm curious about embedded systems. In my limited experience, every >> "embedded system" I programmed for from 2002-2011 had C as its primary >> language.
After 2011, I stopped programming embedded systems, so I don't >> know after that. Why was C so dominant in this space? > > First of all, because C is the (almost) perfect language for embedded > systems - tight code generated, language close to the metal, etc. etc. To my recollection, in 1985 C wasn't firstly considered an embedded language; it was considered an applications language (so was assembly, but we could say that was tapering off). I believe the explosion in popularity was due to that lesson from Unix, that you could have a single portable language for both "system" code and applications code, with a modern looking syntax, that could be self hosted and compiled to reasonably efficient machine code. All those tradeoffs and definitions are very different 40 years later, of course. (And C was far from the first or only language that met those criteria before 1975. It just happened to take off.) > >> Is it because adding >> a backend to gcc was free, C was already well known, and C was sufficiently >> performant? > > Cygnus Solutions (Hi John!) had a lot to do with this. They specialized > in porting GCC to different processors used in embedded systems and > provided support. Having to get a paid consultant doesn't exactly argue for the idea that C compilers were "easy" - plus it's almost a decade after the period of high growth. So this doesn't seem strong support for the thesis quoted by OP. --Toby > > Arnold > From gnu at toad.com Sat May 23 04:40:11 2020 From: gnu at toad.com (John Gilmore) Date: Fri, 22 May 2020 11:40:11 -0700 Subject: [TUHS] History of popularity of C In-Reply-To: References: <20200521182817.08C0318C093@mercury.lcs.mit.edu> <202005221109.04MB92D3016090@freefriends.org> Message-ID: <22671.1590172811@hop.toad.com> Tyler Adams wrote: > Doesn't C++ also generate tight code and is fairly close to the metal? > Today C++ is the high performant language for game developers and HFT shops. 
> > But, I never found it on any of these embedded systems, it was straight C. My take on this is that programmers who understand the underlying hardware architecture can easily intuit the code that would result from what they write in C. There are only a few late features (e.g. struct parameters, longjmp) that require complex code to be generated, or function calls to occur where no function call was written by the programmer. Whereas in C++, Pascal, Python, APL, etc, a few characters can cause the generated code to do immense amounts of unexpected work. Think of string compares, hash table types, object initializers, or arbitrary amounts of jumping through tables of pointers to different kinds of objects. Automated memory allocation. Garbage collection. This is both a blessing and a curse. In C it was quite predictable how well or badly typical sections of your code would perform. If the performance was bad, it was YOUR fault! But at least YOU could fix it, without learning to hack a compiler instead of your own application. (I once found Berkeley SPICE code doing string compares in a triply nested loop, just to look up the names of the signals. In C. Making changes to a large state machine going into a custom chip was taking the Sun hardware engineers multiple hours per change. I spent weeks finding the source code (Sun's tools group was dysfunctional; I got it from UCB). In half a day of profiling it and fixing it to cache the result of the first string lookup on each signal name, four hour rebuilds went down to under a minute. A second day of profiling and cacheing, just for fun, took it down to 10 seconds.) John From stewart at serissa.com Sat May 23 04:43:40 2020 From: stewart at serissa.com (Lawrence Stewart) Date: Fri, 22 May 2020 14:43:40 -0400 Subject: [TUHS] where did "main" come from? Message-ID: C main programs define “main”. 
This also seems to be true of B main programs, according to the Johnson/Kernighan manual. The 1967 Martin Richards BCPL manual doesn’t explain how programs get started. The 1974 update from Martin Richards says there should be an OS addendum that explains this. The 1974 University of Essex BCPL manual says to use START. The 1979 Parc Alto BCPL manual uses Main and I think that must be unchanged from 1972. The AMSTRAD BCPL guide from 1986 uses start(). So who started “main” and when? I can’t find an online copy of the Bell Laboratories BCPL manual (Canaday/Thompson) from 1969 or anything about how to use BCPL on Multics or CTSS. -L From toby at telegraphics.com.au Sat May 23 05:01:40 2020 From: toby at telegraphics.com.au (Toby Thain) Date: Fri, 22 May 2020 15:01:40 -0400 Subject: [TUHS] History of popularity of C In-Reply-To: <22671.1590172811@hop.toad.com> References: <20200521182817.08C0318C093@mercury.lcs.mit.edu> <202005221109.04MB92D3016090@freefriends.org> <22671.1590172811@hop.toad.com> Message-ID: <866f9bf3-278a-f4cd-dc00-49ccc4defb1f@telegraphics.com.au> On 2020-05-22 2:40 PM, John Gilmore wrote: > Tyler Adams wrote: >> Doesn't C++ also generate tight code and is fairly close to the metal? >> Today C++ is the high performant language for game developers and HFT shops. >> >> But, I never found it on any of these embedded systems, it was straight C. > > My take on this is that programmers who understand the underlying > hardware architecture can easily intuit the code that would result from > what they write in C. There are only a few late features (e.g. struct A short time playing with Godbolt should challenge that view :) https://godbolt.org/ > parameters, longjmp) that require complex code to be generated, or > function calls to occur where no function call was written by the > programmer. > > Whereas ...
> > John > From lm at mcvoy.com Sat May 23 05:31:41 2020 From: lm at mcvoy.com (Larry McVoy) Date: Fri, 22 May 2020 12:31:41 -0700 Subject: [TUHS] History of popularity of C In-Reply-To: <22671.1590172811@hop.toad.com> References: <20200521182817.08C0318C093@mercury.lcs.mit.edu> <202005221109.04MB92D3016090@freefriends.org> <22671.1590172811@hop.toad.com> Message-ID: <20200522193141.GH3357@mcvoy.com> On Fri, May 22, 2020 at 11:40:11AM -0700, John Gilmore wrote: > Tyler Adams wrote: > > Doesn't C++ also generate tight code and is fairly close to the metal? > > Today C++ is the high performant language for game developers and HFT shops. > > > > But, I never found it on any of these embedded systems, it was straight C. > > My take on this is that programmers who understand the underlying > hardware architecture can easily intuit the code that would result from > what they write in C. There are only a few late features (e.g. struct > parameters, longjmp) that require complex code to be generated, or > function calls to occur where no function call was written by the > programmer. Amen. > Whereas in C++, Pascal, Python, APL, etc, a few characters can cause the > generated code to do immense amounts of unexpected work. Think of > string compares, hash table types, object initializers, or arbitrary > amounts of jumping through tables of pointers to different kinds of > objects. Automated memory allocation. Garbage collection. Double amen. > This is both a blessing and a curse. In C it was quite predictable how > well or badly typical sections of your code would perform. If the > performance was bad, it was YOUR fault! But at least YOU could fix it, > without learning to hack a compiler instead of your own application. Triple amen. > (I once found Berkeley SPICE code doing string compares in a triply > nested loop, just to look up the names of the signals. In C. 
Making > changes to a large state machine going into a custom chip was taking the > Sun hardware engineers multiple hours per change. I spent weeks finding > the source code (Sun's tools group was dysfunctional; I got it from > UCB). In half a day of profiling it and fixing it to cache the > result of the first string lookup on each signal name, four hour > rebuilds went down to under a minute. A second day of profiling > and cacheing, just for fun, took it down to 10 seconds.) Gazillion amens (I especially loved the jab at Sun's tools group, I wrote the SCM that Sun used for Solaris initially. They tried to get me to join the tools group to make my stuff "official" - it worked just fine being "unofficial". I took a look at the people in the tools group, no offense, but it was a big step down from working with people like srk and gingell and shannon, not to mention that all of my peers were smart. Tools group, just say no.) From lm at mcvoy.com Sat May 23 05:35:29 2020 From: lm at mcvoy.com (Larry McVoy) Date: Fri, 22 May 2020 12:35:29 -0700 Subject: [TUHS] History of popularity of C In-Reply-To: <866f9bf3-278a-f4cd-dc00-49ccc4defb1f@telegraphics.com.au> References: <20200521182817.08C0318C093@mercury.lcs.mit.edu> <202005221109.04MB92D3016090@freefriends.org> <22671.1590172811@hop.toad.com> <866f9bf3-278a-f4cd-dc00-49ccc4defb1f@telegraphics.com.au> Message-ID: <20200522193529.GI3357@mcvoy.com> On Fri, May 22, 2020 at 03:01:40PM -0400, Toby Thain wrote: > On 2020-05-22 2:40 PM, John Gilmore wrote: > > Tyler Adams wrote: > >> Doesn't C++ also generate tight code and is fairly close to the metal? > >> Today C++ is the high performant language for game developers and HFT shops. > >> > >> But, I never found it on any of these embedded systems, it was straight C. > > > > My take on this is that programmers who understand the underlying > > hardware architecture can easily intuit the code that would result from > > what they write in C. 
There are only a few late features (e.g. struct > > A short time playing with Godbolt should challenge that view :) > > https://godbolt.org/ > > > > parameters, longjmp) that require complex code to be generated, or > > function calls to occur where no function call was written by the > > programmer. What John didn't mention, because he just assumes people know and that everyone is the same, is that he is an excellent C programmer; I could fix bugs in his code. You can always find someone who will make a mess of any language. That's not the point. Assume that you have decent programmers, and you will be able to understand and fix their C code. If you have really good C programmers, like my company did, you can start to predict what the bottom half of the function looks like by reading the top half. We wrote very stylized C, and were not afraid of gotos when used wisely. From michael at kjorling.se Sat May 23 06:01:49 2020 From: michael at kjorling.se (Michael =?utf-8?B?S2rDtnJsaW5n?=) Date: Fri, 22 May 2020 20:01:49 +0000 Subject: [TUHS] where did "main" come from? In-Reply-To: References: Message-ID: <5507c573-f458-4eec-8d15-fee211a3b76d@localhost> On 22 May 2020 14:43 -0400, from stewart at serissa.com (Lawrence Stewart): > C main programs define “main”. I don't have a ready answer to your question where that name came from, but it's worth remembering (and easy to forget) that main() isn't the actual starting point of execution of a C program. Rather, the starting point is a function within the C library, which does some early setup work and then ultimately calls main() and takes care of passing the return value from main() back to the operating system (see [1] for Linux, for example). This is perhaps most obvious in C programs for Microsoft Windows, which don't have the traditional main() but do have a WinMain() in its place. It looks like at least glibc uses _start as the actual entry point [2].
In turn, on x86-64 (and very likely also on other architectures), that calls __libc_start_main(), which in turn calls main() via a function pointer passed to it. [1]: https://refspecs.linuxbase.org/LSB_3.1.0/LSB-generic/LSB-generic/baselib---libc-start-main-.html [2]: https://blogs.oracle.com/linux/hello-from-a-libc-free-world-part-1-v2 -- Michael Kjörling • https://michael.kjorling.se • michael at kjorling.se “Remember when, on the Internet, nobody cared that you were a dog?” From michael at kjorling.se Sat May 23 06:19:43 2020 From: michael at kjorling.se (Michael =?utf-8?B?S2rDtnJsaW5n?=) Date: Fri, 22 May 2020 20:19:43 +0000 Subject: [TUHS] History of popularity of C In-Reply-To: <22671.1590172811@hop.toad.com> References: <20200521182817.08C0318C093@mercury.lcs.mit.edu> <202005221109.04MB92D3016090@freefriends.org> <22671.1590172811@hop.toad.com> Message-ID: On 22 May 2020 11:40 -0700, from gnu at toad.com (John Gilmore): > Whereas in C++, Pascal, Python, APL, etc, a few characters can cause the > generated code to do immense amounts of unexpected work. Think of > string compares, hash table types, object initializers, or arbitrary > amounts of jumping through tables of pointers to different kinds of > objects. Automated memory allocation. Garbage collection. What you wrote is pretty much my take on the subject as well. However, part of me wants to say "let's not compare apples to airplanes just because both start with 'a' and one can typically be placed within the other". C++ adds a ton of features on top of C, never mind early C, though for the features that at least earlier C has (I'm honestly not sure about the newer additions), C++ has very similar or downright identical syntax compared to C. As long as you stay with the basic C feature set, I strongly suspect that most programmers who can follow along in the C to assembler to machine code compilation process, can do much the same thing with C++. 
It's when you start piling all the extras on top of it that things get hairy from a code generation perspective. Vectors? Function overloading? Exceptions? RAII? Try predicting the execution order of destructors during exception handling for classes with multiple inheritance where multiple inherited-from classes define destructors. Anything else? :-) -- Michael Kjörling • https://michael.kjorling.se • michael at kjorling.se “Remember when, on the Internet, nobody cared that you were a dog?” From gnu at toad.com Sat May 23 06:39:25 2020 From: gnu at toad.com (John Gilmore) Date: Fri, 22 May 2020 13:39:25 -0700 Subject: [TUHS] History of popularity of C (GCC/Cygnus) In-Reply-To: <202005221109.04MB92D3016090@freefriends.org> References: <20200521182817.08C0318C093@mercury.lcs.mit.edu> <202005221109.04MB92D3016090@freefriends.org> Message-ID: <28538.1590179965@hop.toad.com> Tyler Adams wrote: > > Is it because adding > > a backend to gcc was free, C was already well known, and C was sufficiently > > performant? arnold at skeeve.com wrote: > Cygnus Solutions (Hi John!) had a lot to do with this. They specialized > in porting GCC to different processors used in embedded systems and > provided support. First things first. When figuring out what happened and what became popular, it's important to look at where money was flowing. Economics will tell you things about systems that you can't learn any other way. Second, until the embedded market included 32-bit processors, gcc was unknown in it. 32-bit was way less than 1% of the embedded market; only in multi-thousand dollar things like laser printers (Adobe/Apple LaserWriter) and network switches (3Com/Cisco/etc). Cygnus ended up with lots of those companies as support customers, because they were sick of porting their code through a different compiler for each new generation of hardware platforms. But we had zero visibility into the vast majority of the embedded market. 
We went there because even our tiny niche of it was huge, many times the size of the market for native compilers, and with much more diversity of customers. Early on, GCC had the slight advantage that because it was free (as in both beer and speech) and had an email community of maintainers, many people had started ports to different architectures. Only a few of those were production-quality, but they each offered at least a starting point, and attracted interested users who might pay us to make them real. Cygnus was able to deliver production compilers for each new architecture for significantly less than the other companies building compilers for embedded systems. I think that had more to do with our pricing strategy than the actual cost of modifying the compiler. Our main competitors were half a dozen small, fat, lazy companies who charged $10,000 PER SEAT for cross-compilers and charged the chipmaker $1,000,000 and frequently more, to do a port to their latest chip. Cygnus charged chipmakers $500K for a brand new architecture, and $0 per seat, which caused us to eat our competitors' lunch over our first 3 to 5 years. Then we hired someone who knew more about pricing, and raised our own port price to a larger fraction of what the market would bear, to get better margins while still winning deals. We built the first 64-bit ports of GCC and the tools, for the SPARC when SPARC-64 was still secret, and later for other architectures. (Sun's hardware diagnostics group funded our work. They needed to be able to compile their chip and system tests, a full year before Sun's compiler group could deliver 64-bit compilers for customers.) A lot of what got done to make GCC a standard, production worthy compiler had little to do with the code generation. 
For example, many customers really wanted cross-compilers hosted on DOS and Windows, as well as on various UNIX machines, so we ended up hiring the genius who created djgpp, DJ Delorie, and making THAT into a commercial quality supported product. (We also hired the guy who made GCC run in the old MacOS development environment (Stan Shebs) and one of the senior developers and distributors for free Amiga applications (Fred Fish).) We had to vastly improve the testing infrastructure for the compiler and tools. We designed and built DejaGnu (Rob Savoye's work), and with each bug report, we added to a growingly serious free C and C++ compiler test suite. We automated enough of our build process that we could compare the code produced by different host systems for the same target platform. (The "reproducible builds" teams are now trying to do that for the whole Linux distribution code base.) DejaGnu actually ran our test suite on multiple target platforms with different target architectures, downloading the binaries over serial ports and jumping to them, and compared the tests' output so we could fix any discrepancies that were target-dependent. We hired full-time writers (initially Roland Pesch, who had been a serious programmer earlier in life) to write real manuals and other documentation. We wrote an email-based bug tracking system (PRMS), and the first working over-the-Internet version control system (remote cvs). Our customers all used different object file formats, so we wrote new code for format independence (the BFD library) in the assembler, linker, size, nm and ar and other tools (e.g. we created objdump and objcopy). Ultimately Steve Chamberlain wrote us and GNU a brand-new linker which had the flexibility needed for building embedded system binaries and putting code and data wherever the hardware needed it to go. 
We learned how to hire and manage remote employees, which meant we were able to hire talented gcc and tools hackers from around the country and the world, who jumped at the chance to turn their beloved hobby into a full-time paying gig. We started our own ISP in order to get ourselves good, cheap commercial quality Internet access, and so we could teach our remote employees how to buy and install solid 56kbit/sec Frame Relay connections rather than flaky dialup access. And because we didn't control the master source code for gcc, one of our senior compiler hackers, Jim Wilson, spent a huge fraction of his time merging our changes upstream into FSF GCC, and merging their changes downstream into our product, keeping the ecosystem in sync. We handled that overhead for significant other tools by taking up the whole work of maintenance and release engineering -- for example, I became FSF's maintainer for gdb. I would make an FSF GDB release two weeks before Cygnus would make its own integrated toolchain releases. If bug reports didn't start streaming in within days from the free software community, we knew we had made a solid release; and we had time to patch anything that turned up, before our customers got it from us on cartridge tapes. It wasn't just a compiler, it was a whole ecosystem that had to be built or improved. About half of our employees were software engineers, so by the time our revenues grew from <$1M/year to $25M a year, we were spending about $12M every year improving the free software ecosystem. And because we avoided venture capital for six years, and shared the stock ownership widely among the employees, when we got lucky after 10 years and were acquired by the second free software company to go public (the first was VA Linux, the second Red Hat), all those hackers became millionaires. A case of doing well by doing good. John From beebe at math.utah.edu Sat May 23 07:12:41 2020 From: beebe at math.utah.edu (Nelson H. F. 
Beebe) Date: Fri, 22 May 2020 15:12:41 -0600 Subject: [TUHS] TUHS] where did "main" come from? Message-ID: Lawrence Stewart asks on Fri, 22 May 2020 14:43:40 -0400: >> So who started "main" and when? I have just checked several PDFs of IBM mainframe and Fortran manuals going back to 1954. The early manuals did not appear to use the name "main", but in ibm-7030/C22-6578_7030_Programming_Examples_Apr61.pdf, the phrase "the main program" occurs in the context of assembly language coding. In ibm-7030/C22-6751_7030_FORTRAN_IV_May63.pdf, on page 22, which begins with Part II. FORTRAN Programming for the IBM 7030 the first paragraph ends with "the main program". In silliac/SPMpart1-ocr.pdf, titled Silliac Programming Manual The Adolph Basser Computing Laboratory School of Physics The University of Sydney and dated January 1959, on page 99, I find >> ... >> In this way a program is seen to consist of several distinct, >> self-contained blocks, namely the various subroutines and the part of >> the program (usually called the main program or master routine) which >> ^^^^^^^^^^^^^^^^ >> makes use of its subroutines by sending control to them. >> ... I have many manuals for older systems, but most have not been subjected to optical-character recognition, so it is difficult to find specific text in them. Nevertheless, I have demonstrated that by at least January 1959, the phrase "main program" was common enough to appear in computer documentation, qualified by "usually". Some day, perhaps I'll find time to do OCR conversion on my extensive PDF file archives. I'll be pleased to hear of earlier uses of "main program" from TUHS list members. ------------------------------------------------------------------------------- - Nelson H. F. 
Beebe Tel: +1 801 581 5254 - - University of Utah FAX: +1 801 581 4148 - - Department of Mathematics, 110 LCB Internet e-mail: beebe at math.utah.edu - - 155 S 1400 E RM 233 beebe at acm.org beebe at computer.org - - Salt Lake City, UT 84112-0090, USA URL: http://www.math.utah.edu/~beebe/ - ------------------------------------------------------------------------------- From clemc at ccc.com Sat May 23 07:52:17 2020 From: clemc at ccc.com (Clem Cole) Date: Fri, 22 May 2020 17:52:17 -0400 Subject: [TUHS] where did "main" come from? In-Reply-To: References: Message-ID: It's interesting, I was thinking about this the other day too. I remember talking about the 'main program' in Fortran when I was learning. I never thought about it when I saw it in C, other than, ok that's how you pass command line args, which I thought was really clean. I remember on TOPS and TSS you had to go rummaging around to get to them. As for your BCPL question, START() was the way I learned it. I think I first saw it on the 360s or maybe the 1108; but really never did much with it until I saw the first Altos. Clem On Fri, May 22, 2020 at 2:53 PM Lawrence Stewart wrote: > C main programs define “main”. > This also seems to be true of B main programs, according to the > Johnson/Kernighan manual > The 1967 Martin Richards BCPL manual doesn’t explain how programs get > started > The 1974 update from Martin Richards says there should be an OS addendum > that explains this. > The 1974 University of Essex BCPL manual says to use START > The 1979 Parc Alto BCPL manual uses Main and I think that must be > unchanged from 1972. > The AMSTRAD BCPL guide from 1986 uses start() > > > So who started “main” and when? I can’t find an online copy of the Bell > Laboratories BCPL manual (Canaday/Thompson) from 1969 or anything about how > to use BCPL on Multics or CTSS. > > -L
From robpike at gmail.com Sat May 23 08:00:06 2020 From: robpike at gmail.com (Rob Pike) Date: Sat, 23 May 2020 08:00:06 +1000 Subject: [TUHS] where did "main" come from? In-Reply-To: References: Message-ID: After joining the Labs I mentioned to Dennis that the idea of "main" did not appear in the semi-formal specification of C in the back of the first edition of The C Programming Language. (It obviously appears in the front half.) He was very surprised, but maybe it's an indication that the idea of "main" was already part of the culture. I first encountered it in the phrase PROCEDURE OPTIONS MAIN that begins every PL/I program. PL/I was defined in the early 1960's. -rob From charles.unix.pro at gmail.com Sat May 23 08:08:15 2020 From: charles.unix.pro at gmail.com (Charles Anthony) Date: Fri, 22 May 2020 15:08:15 -0700 Subject: [TUHS] where did "main" come from? In-Reply-To: References: Message-ID: On Fri, May 22, 2020 at 11:54 AM Lawrence Stewart wrote: > I can’t find an online copy of the Bell Laboratories BCPL manual > (Canaday/Thompson) from 1969 or anything about how to use BCPL on Multics > or CTSS. In general, Multics does not have a concept of "main"; the entry point of a program is the name of the program. Looking at the Multics runoff sources (written in BCPL) we see in the segment runoff_driver.bcpl: external $( RunoffCommand = "runoff" ... let RunoffCommand () be main $( MONITOR := Open (StreamName + Write, "error_output") // Errors, etc. written here. ... BCPL replaces the name RunoffCommand with runoff during compilation; the compiled segment runoff_driver will have an entry point "runoff". Entering the command "runoff" will search segments in the search path for that entry point. Since segments can have multiple entry points, the idea of "main" (or "start" or "start_") as the defining entry point is not meaningful in Multics.
The Multics C compiler (based on PCC) does have a concept of main; the C linker aliases that to the segment name, making the entry point name the same as the segment name. Thus compiling and linking foo.c generates a segment foo with an entry point foo, which points to main. (Actually, it aliases it to the C runtime library initialization which calls main.) -- Charles From toby at telegraphics.com.au Sat May 23 09:33:32 2020 From: toby at telegraphics.com.au (Toby Thain) Date: Fri, 22 May 2020 19:33:32 -0400 Subject: [TUHS] where did "main" come from? In-Reply-To: References: Message-ID: <5cef3bd0-5dfc-2d4a-cdde-9d7b03dea353@telegraphics.com.au> On 2020-05-22 5:52 PM, Clem Cole wrote: > It's interesting, I was thinking about this the other day too. I > remember talking about the 'main program' in Fortran when I was > learning. I never thought about it when I saw it in C, other than, ok > that's how you pass command line args, which I thought was really > clean. I remember on TOPS and TSS you had to go rummaging around to get > to them. > > As for your BCPL question, START() was the way I learned it. I think I > first saw it on the 360s or maybe the 1108; but really never did much with it > until I saw the first Altos. This chart could lead to some predictable conclusions, don't know if they are correct: https://books.google.com/ngrams/graph?content=main+program&year_start=1930&year_end=2008&corpus=17&smoothing=3&share=&direct_url=t1%3B%2Cmain%20program%3B%2Cc0 > > Clem > > On Fri, May 22, 2020 at 2:53 PM Lawrence Stewart > wrote: > > C main programs define “main”. > This also seems to be true of B main programs, according to the > Johnson/Kernighan manual > The 1967 Martin Richards BCPL manual doesn’t explain how programs > get started > The 1974 update from Martin Richards says there should be an OS > addendum that explains this.
> The 1974 University of Essex BCPL manual says to use START > The 1979 Parc Alto BCPL manual uses Main and I think that must be > unchanged from 1972. > The AMSTRAD BCPL guide from 1986 uses start() > > > So who started “main” and when? I can’t find an online copy of the > Bell Laboratories BCPL manual (Canaday/Thompson) from 1969 or > anything about how to use BCPL on Multics or CTSS. > > -L > From woods at robohack.ca Sat May 23 09:50:33 2020 From: woods at robohack.ca (Greg A. Woods) Date: Fri, 22 May 2020 16:50:33 -0700 Subject: [TUHS] History of popularity of C In-Reply-To: <8a2e9b1b-8890-a783-5b53-c8480c070f2e@telegraphics.com.au> References: <8a2e9b1b-8890-a783-5b53-c8480c070f2e@telegraphics.com.au> Message-ID: I always assumed C became popular because there was a very large cohort of programmers who started with it as their first language, usually on early Unix, at university, in the very late 1970s and early 1980s. After all if I was exposed to it at a small Canadian university in the early 1980s, then surely it was almost everywhere! At least that's how it happened for me. I was already fluent in BASIC and reasonably good at Pascal before I went to university, and though we had a very wide variety of languages to work with since we had accounts on both Unix and Multics systems right from the start of first year, C was the strong favourite amongst both juniors and PhDs, i.e. all but the most die-hard Multics lovers (who of course used and loved PL/1, though by 1985 there was even talk of C on Multics). Some of this popularity of C was no doubt due to the fact that those a year or two ahead of me had started with FORTRAN on an IBM 370 and had absolutely hated it and were very vocal to those of us coming up behind that we were very lucky to jump right onto the Unix (and Multics) machines right from the start.
My first job programming in 1983/84 was back to BASIC and assembler, but a year later and I was writing C again (though sadly mostly on MS-DOS, briefly on Xenix, then back to very early MS-Windows until about 1988 -- not long in hindsight, but it was painful). At Thu, 21 May 2020 12:10:35 -0400, Toby Thain wrote: Subject: Re: [TUHS] History of popularity of C > > - inexpensive compiler availability was not very good until ~1990 or > later, but C had been taking off like wildfire for 10 years before that Well, there were a plethora of both full C and "tiny"/"small" C compilers widely available in the very early 1980s. Indeed I would say inexpensive C compilers were widely available and very popular well before 1985, and a few "toy/tiny" compilers were freely available by then too. By 1985 I was doing C development, primarily on MS-DOS systems, using commercial compilers, for a wide variety of projects, mostly in big national companies (in Canada, such as CP Rail). I would say C was the first commercially successful systems-level language available across many platforms, and that this was evidently so by 1985. Early Atari (6502) computers were partly programmed with a cross- compiler, though I've no idea what it was (possibly a re-targeted PCC). I think VisiCalc had similar origins. The most ground-breaking C compiler might arguably have been P.J.Plauger's Whitesmiths C compiler, around about 1978. I don't think it was what you'd call "inexpensive" necessarily, but it was popular. The BD Software company's C compiler for CP/M (8080/z80) was released in 1979. The first version of Mark Williams C came out very early, possibly before 1980. I owned a copy for MS-DOS 386 by 1985/86. This was the most Unix-like compiler and library, by far, and quite inexpensive (else I wouldn't have been able to afford my own personal copy). Small-C appeared in Dr.Dobb's in May 1980 (and it spawned a plethora of derivatives of its own). 
C was everywhere in personal computing literature by 1980. I believe Aztec C was first released in 1980. Two books about C were published by McGraw-Hill in 1982: "The C Primer", Les Hancock and Morris Krieger; and "The C Puzzle Book", Alan R. Feuer. There were likely more. Then there was Lattice C, out and about by 1982 and VERY popular and widely used by 1984. (I was using the second version in 1985/1986 on PCs. It's probably the buggiest compiler I've ever used for real work projects.) "Learning to Program in C" by Thomas Plum was published in 1983. And of course there was Tanenbaum and Jacobs' ACK, with a C parser front-end in the early 1980s (even by 1980?). Brad Templeton wrote a C (or maybe Tiny-C) compiler for C64/6502 around about 1984 (though he only commercialized the "PAL" assembler I think). In my estimation GCC really only served to cement C's early success and popularity. It gave people certainty that a good C compiler would be available for most any platform no matter what happened. I would also argue that non-Unix C compilers actually drove the adoption curve of C. Pascal tried to play catch-up, but just as with what happened to me in university where it was one of the teaching languages, C was just far more popular and though Pascal had a tiny head-start (in terms of first-published books/manuals), C overtook it and had far more staying power too (though indeed in the late 1980s there was a fair battle going on in the pc/mac/amiga/etc world for Pascal). -- Greg A. Woods Kelowna, BC +1 250 762-7675 RoboHack Planix, Inc. Avoncote Farms From treese at acm.org Sat May 23 09:58:31 2020 From: treese at acm.org (Win Treese) Date: Fri, 22 May 2020 19:58:31 -0400 Subject: [TUHS] where did "main" come from?
In-Reply-To: <5cef3bd0-5dfc-2d4a-cdde-9d7b03dea353@telegraphics.com.au> References: <5cef3bd0-5dfc-2d4a-cdde-9d7b03dea353@telegraphics.com.au> Message-ID: <8B69B753-0713-430E-8A1A-AE23206127AC@acm.org> > On May 22, 2020, at 7:33 PM, Toby Thain wrote: > >> As for your BCPL question, START() was the way I learned it. I think I >> first saw it on the 360s or maybe the 1108; but really never did much with it >> until I saw the first Altos. > > This chart could lead to some predictable conclusions, don't know if > they are correct: > > https://books.google.com/ngrams/graph?content=main+program&year_start=1930&year_end=2008&corpus=17&smoothing=3&share=&direct_url=t1%3B%2Cmain%20program%3B%2Cc0 This is straying way off topic, but I thought it would be interesting to look at a couple of older sources about it, say, from the late 1940s when there were computers with programs. A search on Google Books in the date range 1800-1950 gives a lot of hits (at least 10 screens worth). Since it’s Google, your results may vary, but here are the first few I got: - US Congress hearings on National Health Program, 1946: "Any discussion on either the main program or the amendment to…” Fair enough. Got the phrase, it’s in the range. Just not relevant to the current investigation. - C Programming: Test Your Skills by Ashok Kamthane, dated by Google Books to 1900 - Information Circular, dated 1925, with the excerpt starting “It consists of a main program, two subroutine subprograms, A macro-flow chart of the program is shown on figure A-1”. Which seems odd, because that pretty clearly isn’t really from 1925. So I clicked through the document, which turns out to be “Information Circular 1601: Corrosion Resistance of Metals in Hot Brines: A Literature Review” published by the US Bureau of Mines in 1973. It also does not have the excerpt in the document. - Programming Techniques Through C: A Beginners Companion by M. G. Venkateshmurthy, dated 1900.
- Technical Bulletin, Issues 206-216, dated 1922 with the excerpt "The main program (MAIN) will be discussed first and then each of the subroutines called by the main program. Sometimes the subroutines called by the main program call other subroutines.”. Clicking through to it gives the response “No results in this book for ‘main program’”. This turns out to be incorrect, because somewhere between the Google Books server and my Safari browser the search string was mishandled. Changing it in the search box gives 3 snippets referencing “main program”, and the document is apparently about a FORTRAN program compiled with the CDC6400 FTN version 3 compiler. However, nothing more than the snippets is available and the 1922 date is obviously wrong. Remaining items on the first page are similarly clearly misdated or about non-computer main programs. No one said archival research is easy, but Google Books does present itself as having better data than it delivers. - Win From thomas.paulsen at firemail.de Sat May 23 14:33:07 2020 From: thomas.paulsen at firemail.de (Thomas Paulsen) Date: Sat, 23 May 2020 06:33:07 +0200 Subject: [TUHS] History of popularity of C (GCC/Cygnus) In-Reply-To: <28538.1590179965@hop.toad.com> References: <20200521182817.08C0318C093@mercury.lcs.mit.edu> <202005221109.04MB92D3016090@freefriends.org> <28538.1590179965@hop.toad.com> Message-ID: <5ef1f056cf19b3c157fc3a8f769b4494@firemail.de> >Early on, GCC had the slight advantage that because it was free (as in >both beer and speech) and had an email community of maintainers I remember that we started moving to gcc, gmake, etc., because these tools, simply put, performed better than the native SNI ones. From akosela at andykosela.com Sat May 23 17:28:25 2020 From: akosela at andykosela.com (Andy Kosela) Date: Sat, 23 May 2020 09:28:25 +0200 Subject: [TUHS] History of popularity of C In-Reply-To: References: <8a2e9b1b-8890-a783-5b53-c8480c070f2e@telegraphics.com.au> Message-ID: On 5/23/20, Greg A.
Woods wrote: > > I would also argue that non-Unix C compilers actually drove the adoption > curve of C. Pascal tried to play catch-up, but just as with what > happened to me in university where it was one of the teaching languages, > C was just far more popular and though Pascal had a tiny head-start (in > terms of first-published books/manuals), C overtook it and had far more > staying power too (though indeed in the late 1980s there was a fair > battle going on in the pc/mac/amiga/etc world for Pascal). This is my recollection as well. In the late 80s with the introduction of really nice compilers for MS-DOS like Turbo C from Borland (1987), Watcom C 6.0 (1988) and mature versions of Microsoft C (which originally was based on Lattice C), the C future was solidified. The documentation coming with those compilers was also excellent. I still have tons of reference books from that period. It was a time when almost everybody was using pure C. I think C++ needed another 5-7 years to displace C in the application market. --A From clemc at ccc.com Sun May 24 03:08:28 2020 From: clemc at ccc.com (Clem Cole) Date: Sat, 23 May 2020 13:08:28 -0400 Subject: [TUHS] History of popularity of C In-Reply-To: References: <8a2e9b1b-8890-a783-5b53-c8480c070f2e@telegraphics.com.au> Message-ID: On Fri, May 22, 2020 at 7:51 PM Greg A. Woods wrote: > I always assumed C became popular because there was a very large cohort > of programmers who started with it as their first language, usually on > early Unix, at university, in the very late 1970s and early 1980s. > Exactly - by giving away UNIX, it cemented the language and the technology into a group of young engineers (like me) who then 'spread the gospel' when we went to real jobs. > Well, there were a plethora of both full C and "tiny"/"small" C > compilers widely available in the very early 1980s. > Yep -- I listed a little of the pre-history.
> > Indeed I would say inexpensive C compilers were widely available and > very popular well before 1985, and a few "toy/tiny" compilers were > freely available by then too. Yup, although until the 386 and the DOS extenders, it could be tough to use with Gordon's awful 'far pointer' infection. > Early Atari (6502) computers were partly programmed with a cross- > compiler, though I've no idea what it was (possibly a re-targeted PCC). > Most 6502 shops were assembler, although you are correct cc65 shows up reasonably early. It was not PCC based. > I think VisiCalc had similar origins. > Dan Bricklin wrote it in assembler. He had access to the same Harvard PDP-10 that Gates and Allen had used to write MITS Basic a few years earlier. I should ask him to be sure, but I was under the impression he used the SAIL based 6502 assembler I mentioned previously.[1] > > The most ground-breaking C compiler might arguably have been > P.J.Plauger's Whitesmiths C compiler, around about 1978. I don't think > it was what you'd call "inexpensive" necessarily, but it was popular. > Other than his wretched 'anat' - a natural assembler, which was far from natural. But you are correct, particularly for non-UNIX boxes, he had the first 'widely used' compiler. > In my estimation GCC really only served to cement C's early success and > popularity. It gave people certainty that a good C compiler would be > available for most any platform no matter what happened. > I would agree. C had already been 'winning' by the time of gcc, and offering a compiler that was so portable and generated 'reasonable' code (sometimes even better than some of the commercial ones) I think was the winning score. > > I would also argue that non-Unix C compilers actually drove the adoption > curve of C. > I would put a small accent on that. I think the C compilers that targeted non-UNIX systems, and in particular the microprocessors were the driver. The micros started with assembler in most cases.
Basic shows up and is small, but it's not good enough for real products like VisiCalc or later Lotus. Pascal tries to be the answer, but I think it suffered from the fact that to make Pascal a production quality language, you had to extend it and everybody's extensions were different. So, C came along and was 'better than assembler' and allowed 'production quality code' to be written, but with the exception of the far pointer stuff, pretty much worked as dmr had defined it for the PDP-11. So code could be written to work between compilers and systems. When the 386 DOS extenders show up, getting rid of far, and making it a 32-bit based language like the Vax and 68000, C had won. Clem 1.] FWIW: Bricklin I know socially. He was one of my brother's quad-mates at HBS in 1978-79 when he wrote VisiCalc to do his homework [the story is on the Wikipedia page]. In fact, there is now a plaque in the shared lounge over the nook where his study carrel was when he wrote it. The four of them all did pretty well. You know Dan's story, his roommate went on to found Staples, my brother's roommate became the CEO of Pepsi, and my brother ran Milcron, then founded a materials handling firm that did the automation for Amazon (and he sold the firm a few years ago to Honeywell). Also, their section-mate was Clay Christensen of 'Innovator's Dilemma' fame and of course classmate Meg Whitman would do eBay. Pretty impressive class from HBS. From rich.salz at gmail.com Sun May 24 03:22:14 2020 From: rich.salz at gmail.com (Richard Salz) Date: Sat, 23 May 2020 13:22:14 -0400 Subject: [TUHS] History of popularity of C In-Reply-To: References: <8a2e9b1b-8890-a783-5b53-c8480c070f2e@telegraphics.com.au> Message-ID: Also around that time was Leor Zolman's BDS C compiler for MSDOS, used by Mark of the Unicorn for their MINCE editor and Scribble word processor.
MINCE stands for "Mince Is Not Complete Emacs", and you can figure out where Scribble came from. I bought a motorcycle off Leor :-) From dfawcus+lists-tuhs at employees.org Sun May 24 04:42:33 2020 From: dfawcus+lists-tuhs at employees.org (Derek Fawcus) Date: Sat, 23 May 2020 19:42:33 +0100 Subject: [TUHS] History of popularity of C In-Reply-To: References: <8a2e9b1b-8890-a783-5b53-c8480c070f2e@telegraphics.com.au> Message-ID: <20200523184233.GA21301@clarinet.employees.org> On Sat, May 23, 2020 at 01:08:28PM -0400, Clem Cole wrote: > So, C came along and was 'better than assembler' and allowed 'production > quality code' to be written, but with the exception of the far pointer > stuff, pretty much worked as dmr had defined it for the PDP-11. So code > could be written to work between compilers and systems. When the 386 DOS > extenders show up, getting rid of far, and making it a 32-bit based > language like the Vax and 68000, C had won. Certainly having a flat 32 bit compiler was eventually useful, but even prior to that the impact of 'far' pointers wasn't always an issue. For simple tasks, one simply ignored it (wrote w/o 'far'), and then compiled as either small or large memory model. It was only if one wanted to optimise the code that 'far' became an issue, and a lot of code was never shipped, so didn't need to be so optimised. Even a lot of the shipped code I worked on with those DOS based compilers simply used large memory model, and ignored 'far'. More of an issue was the segmented memory, and that structures couldn't be larger than 64k. For targeting DOS, compilers eventually offered 'huge' pointers, and possibly a 'huge' memory model which hid the problem; but those were of no use in protected 16 bit mode - which the embedded RT-OS I was developing for at the time used.
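To make the 64k limit above concrete, here is a minimal sketch in C, assuming 8086 real-mode addressing, where the linear address is segment * 16 + offset. The names far_ptr, linear_addr, far_add and huge_add are hypothetical and do not come from any DOS compiler; the code only models why arithmetic on a 16-bit offset wraps inside one segment, and roughly what 'huge'-style pointer arithmetic did to hide that.

```c
#include <stdint.h>

/* Hypothetical model of a real-mode far pointer: a segment:offset pair. */
typedef struct {
    uint16_t seg;
    uint16_t off;
} far_ptr;

/* 8086 real-mode address formation: segment * 16 + offset. */
uint32_t linear_addr(far_ptr p)
{
    return ((uint32_t)p.seg << 4) + p.off;
}

/* 'far'-style arithmetic (assumed model): only the 16-bit offset moves,
 * so stepping past 0xFFFF wraps back to the start of the same segment. */
far_ptr far_add(far_ptr p, uint32_t n)
{
    p.off = (uint16_t)(p.off + n);
    return p;
}

/* 'huge'-style arithmetic (assumed model): go through the linear address
 * and renormalize so the offset stays below 16, carrying into the segment. */
far_ptr huge_add(far_ptr p, uint32_t n)
{
    uint32_t lin = (linear_addr(p) + n) & 0xFFFFFu; /* 20-bit address space */
    far_ptr r;
    r.seg = (uint16_t)(lin >> 4);
    r.off = (uint16_t)(lin & 0xFu);
    return r;
}
```

Starting from 1000:FFFF, far_add by one lands on 1000:0000 (back at the start of the segment), while huge_add lands on 2000:0000, the next linear byte. The renormalization depends on real-mode address formation, which is consistent with 'huge' pointers being of no use in 16-bit protected mode, where the segment value is a selector rather than a paragraph number.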
DF From michael at kjorling.se Sun May 24 05:28:27 2020 From: michael at kjorling.se (Michael =?utf-8?B?S2rDtnJsaW5n?=) Date: Sat, 23 May 2020 19:28:27 +0000 Subject: [TUHS] History of popularity of C In-Reply-To: References: <8a2e9b1b-8890-a783-5b53-c8480c070f2e@telegraphics.com.au> Message-ID: On 23 May 2020 13:08 -0400, from clemc at ccc.com (Clem Cole): >> I would also argue that non-Unix C compilers actually drove the adoption >> curve of C. > > I would put a small accent on that. I think the C compilers that targeted > non-UNIX systems, and in particular the microprocessors were the driver. > The micro's started with assembler in most cases. Basic shows up and is > small, but it's not good enough for real products like VisiCalc or later > Lotus. Pascal tries to be the answer, but I think it suffered from the > fact that it makes Pascal a production quality language, you had a extend > it and everybody's extensions were different. There's also the issue that, even once you get into compiled BASIC territory, those wretched vendor-unique extensions show up again. Try porting, say, a non-trivial program written for QuickBASIC to Turbo BASIC even on the same PC. Both Pascal and BASIC are hard to extend by the programmer who's actually using them to try to write useful end-user software, _particularly_ in ways that fit into the rest of the code, so you're essentially stuck with what the compiler vendor thought you would need, or what they thought you would be willing to pay for, in memory or money. On the flip side, much of C's magic really isn't in the language (which is quite, pardon me, basic), but rather in the standard library. 
Yes, C('s standard library) ended up with its share of vendor-specific extensions as well, but the language itself actually gave the programmer the building blocks needed to, if necessary, even implement those extensions for a different compiler; most often without resorting to more than minimal amounts of assembler, and often outright none. So you weren't stuck with what the compiler vendor gave you; it was actually possible to effectively _extend_ the language vocabulary yourself, if you felt a need to do that. I didn't do serious enough programming back during those days for that to matter to me, but now that I get paid to write software, I definitely come across situations at times where the ability to extend the language in such a manner (and have the code using those extensions read idiomatically for the language) is awful nice. -- Michael Kjörling • https://michael.kjorling.se • michael at kjorling.se “Remember when, on the Internet, nobody cared that you were a dog?” From steve at quintile.net Mon May 25 10:11:09 2020 From: steve at quintile.net (Steve Simon) Date: Mon, 25 May 2020 01:11:09 +0100 Subject: [TUHS] main Message-ID: <9E3618DA-1B14-4928-826F-4087F02DB783@quintile.net> re: main. i was surprised not to see any mention of the entry keyword reserved but, i believe, never used in early c. it is listed in k&r ed1. i always assumed this was to allow the author to choose an alternative to main() for the program’s entry point, but we all know what assumption is... -Steve From clemc at ccc.com Mon May 25 11:12:23 2020 From: clemc at ccc.com (Clem Cole) Date: Sun, 24 May 2020 21:12:23 -0400 Subject: [TUHS] main In-Reply-To: <9E3618DA-1B14-4928-826F-4087F02DB783@quintile.net> References: <9E3618DA-1B14-4928-826F-4087F02DB783@quintile.net> Message-ID: No, it’s a FORTRAN-ism. That allows multiple entries into a subroutine. On Sun, May 24, 2020 at 8:53 PM Steve Simon wrote: > > re: main. 
> > i was surprised not to see any mention of the entry keyword reserved but, > i believe, never used in early c. it is listed in k&r ed1. > > i always assumed this was to allow the author to choose an alternative to > main() for the program’s entry point, but we all know what assumption is... > > -Steve > > > -- Sent from a handheld expect more typos than usual -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomas.paulsen at firemail.de Mon May 25 21:37:21 2020 From: thomas.paulsen at firemail.de (Thomas Paulsen) Date: Mon, 25 May 2020 13:37:21 +0200 Subject: [TUHS] main In-Reply-To: <9E3618DA-1B14-4928-826F-4087F02DB783@quintile.net> References: <9E3618DA-1B14-4928-826F-4087F02DB783@quintile.net> Message-ID: >i was surprised not to see any mention of the entry keyword reserved but, >i believe, never used in early c. it is listed in k&r ed1. >i always assumed this was to allow the author to choose an alternative to >main() for the program’s entry point, but we all know what assumption is... older CC's came with the so-called startup code where one could increase the number of file handle and so on before compiling and linking the resulting .o to the application object files. Of course one could rename the invocation of main to something like beatles. ;-) Even M$ msc supplied the startup code. -Steve From dave at horsfall.org Tue May 26 14:21:13 2020 From: dave at horsfall.org (Dave Horsfall) Date: Tue, 26 May 2020 14:21:13 +1000 (EST) Subject: [TUHS] History of popularity of C In-Reply-To: References: <8a2e9b1b-8890-a783-5b53-c8480c070f2e@telegraphics.com.au> Message-ID: On Sat, 23 May 2020, Clem Cole wrote: > [...]  Pascal tries to be the answer, but I think it suffered from the > fact that it makes Pascal a production quality language, you had a > extend it and everybody's extensions were different. Perhaps I'm the only one here, but when I was taught Pascal (possibly by Dr. 
Lions himself) it was emphasised to us that it was not a production language but a *teaching* language; you designed your algorithm, debugged it with the Pascal compiler, then hand-translated it into your favourite language (and debugged it again :-/). That damned "pre-fill read buffer" was always a swine with interactive sessions, though; I recall Andrew Hume threatening to insert a keyboard into the terminal's CRT if he saw that "?" prompt on the Cyber... -- Dave From erc at pobox.com Tue May 26 14:32:09 2020 From: erc at pobox.com (Ed Carp) Date: Mon, 25 May 2020 23:32:09 -0500 Subject: [TUHS] History of popularity of C In-Reply-To: References: <8a2e9b1b-8890-a783-5b53-c8480c070f2e@telegraphics.com.au> Message-ID: "Perhaps I'm the only one here..." You're not. I was taught the same thing. It was never intended to be a production language.

From robpike at gmail.com Tue May 26 18:21:47 2020 From: robpike at gmail.com (Rob Pike) Date: Tue, 26 May 2020 18:21:47 +1000 Subject: [TUHS] History of popularity of C In-Reply-To: References: <8a2e9b1b-8890-a783-5b53-c8480c070f2e@telegraphics.com.au> Message-ID: The peculiar input semantics of Pascal are a consequence of a locally hacked-up version of NOS (I think that's the name) that ran on the big CDC machines at ETH in Zurich. It was entirely a card-based system then, and the way Pascal required read-ahead worked perfectly on that system, but not really on any other, including other card-based, even NOS systems. I was told this when I worked on that same machine as an exchange student working at EIR outside Zurich, but not by Wirth himself. I couldn't bring myself to ask him personally. -rob -------------- next part -------------- An HTML attachment was scrubbed... URL: From clemc at ccc.com Wed May 27 00:32:43 2020 From: clemc at ccc.com (Clem Cole) Date: Tue, 26 May 2020 10:32:43 -0400 Subject: [TUHS] History of popularity of C In-Reply-To: References: <8a2e9b1b-8890-a783-5b53-c8480c070f2e@telegraphics.com.au> Message-ID: On Tue, May 26, 2020 at 12:22 AM Dave Horsfall wrote: > On Sat, 23 May 2020, Clem Cole wrote: > > > [...] Pascal tries to be the answer, but I think it suffered from the > > fact that it makes Pascal a production quality language, you had a > > extend it and everybody's extensions were different. > > Perhaps I'm the only one here, but when I was taught Pascal (possibly by > Dr. Lions himself) it was emphasised to us that it was not a production > language bur a *teaching* language; you designed your algorithm, debugged > it with the Pascal compiler, then hand-translated it into your favourite > language (and debugged it again :-/). > > Dave that was exactly my point. Pascal was designed as a teaching language so Wirth did not put things into the language that made it helpful as a production language. 
So everyone else tried and the language became a mess. Everybody peed on it. Dennis' quote: “When I read commentary about suggestions for where C should go, I often think back and give thanks that it wasn't developed under the advice of a worldwide crowd.” It's not that you could not turn Pascal into a production language, but every attempt to try to do so was done in a different manner. And within firms it was always different. Eight different 'Tek Pascal' implementations -- all close, but different - he says shaking his head. -------------- next part -------------- An HTML attachment was scrubbed... URL: From clemc at ccc.com Wed May 27 00:44:18 2020 From: clemc at ccc.com (Clem Cole) Date: Tue, 26 May 2020 10:44:18 -0400 Subject: [TUHS] History of popularity of C In-Reply-To: References: <8a2e9b1b-8890-a783-5b53-c8480c070f2e@telegraphics.com.au> Message-ID: On Tue, May 26, 2020 at 4:23 AM Rob Pike wrote: > The peculiar input semantics of Pascal are a consequence of a locally > hacked-up version of NOS (I think that's the name) that ran on the big CDC > machines at ETH in Zurich. It was entirely a card-based system then, and > the way Pascal required read-ahead worked perfectly on that system, but not > really on any other, including other card-based, even NOS systems. > Yep, NOS was always a real mess. The ASCII vs 6-bit Display code got mixed up in this too, IIRC. But again, if you think of Pascal as a teaching language under a batch system, where the student tosses in her/his program and some data to run against it. The batch queue eventually picks up your 'job', tries to compile the code, and if successful will run the executable it once on your input deck - a small light comes on. Yeah it does that just fine and it is a pretty simple model. BTW: a number of those local NOS hacks were to make the system easier to use with student batch files. 
I think it was Ward Cunningham that told me in the late 1970s, ETH got some of those NOS hacks from Purdue - Ward had been working in the Purdue computer center and he sent the CDC tape to them (remember Purdue was late to the Arpanet and I do not think ETH was one of the few places in Europe that had connections). Sending mag tapes via mail or maybe FedEx/DHL was pretty standard in those days. Particularly within Universities, shops with the same hardware and/or OS tended to share a lot of tricks and solutions to issues. FWIW: that particular 6500 from Purdue is now at the LCM+L in Seattle. From toby at telegraphics.com.au Wed May 27 01:19:19 2020 From: toby at telegraphics.com.au (Toby Thain) Date: Tue, 26 May 2020 11:19:19 -0400 Subject: [TUHS] History of popularity of C In-Reply-To: References: <8a2e9b1b-8890-a783-5b53-c8480c070f2e@telegraphics.com.au> Message-ID: On 2020-05-26 12:21 AM, Dave Horsfall wrote: > On Sat, 23 May 2020, Clem Cole wrote: > >> [...]  Pascal tries to be the answer, but I think it suffered from the >> fact that it makes Pascal a production quality language, you had a >> extend it and everybody's extensions were different. > > Perhaps I'm the only one here, but when I was taught Pascal (possibly by > Dr. Lions himself) it was emphasised to us that it was not a production > language bur a *teaching* language; you designed your algorithm, > debugged it with the Pascal compiler, then hand-translated it into your > favourite language (and debugged it again :-/). Prof. Knuth came up with an interesting solution to that -- in the process, inventing (or maturing) the concept of "literate programming". Perhaps it's not well known that his most widely used programs (e.g. TeX) were written in something VERY close to standard Pascal (preprocessing aside). The translation to C (as required by certain platforms) was mechanical.
--Toby > > That damned "pre-fill read buffer" was always a swine with interactive > sessions, though; I recall Andrew Hume threatening to insert a keyboard > into the terminal's CRT if he saw that "?" prompt on the Cyber... > > -- Dave From thomas.paulsen at firemail.de Wed May 27 02:00:53 2020 From: thomas.paulsen at firemail.de (Thomas Paulsen) Date: Tue, 26 May 2020 18:00:53 +0200 Subject: [TUHS] History of popularity of C In-Reply-To: References: <8a2e9b1b-8890-a783-5b53-c8480c070f2e@telegraphics.com.au> Message-ID: <9e5933a166ece32b4fb17c6bbb563873@firemail.de> >Dr. Lions himself) it was emphasised to us that it was not a production >language bur a *teaching* language; In the early 90ths I written some larger programs in Turbo Pascal after years of intensively working with my favored C&C++ language, and was surprised how well designed the Borland language was. Thus, recently I installed Free-Pascal with its comfortable IDE and since then I'm wondering why they always inventing new languages as these 'old' C&Pascal languages are so well designed and implemented, that I can't imagine that anything else is really needed. From cbbrowne at gmail.com Wed May 27 02:21:28 2020 From: cbbrowne at gmail.com (Christopher Browne) Date: Tue, 26 May 2020 12:21:28 -0400 Subject: [TUHS] History of popularity of C In-Reply-To: <9e5933a166ece32b4fb17c6bbb563873@firemail.de> References: <8a2e9b1b-8890-a783-5b53-c8480c070f2e@telegraphics.com.au> <9e5933a166ece32b4fb17c6bbb563873@firemail.de> Message-ID: On Tue, 26 May 2020 at 12:01, Thomas Paulsen wrote: > >Dr. Lions himself) it was emphasised to us that it was not a production > >language bur a *teaching* language; > > In the early 90ths I written some larger programs in Turbo Pascal after > years of intensively working with my favored C&C++ language, and was > surprised how well designed the Borland language was. 
Thus, recently I > installed Free-Pascal with its comfortable IDE and since then I'm wondering > why they always inventing new languages as these 'old' C&Pascal languages > are so well designed and implemented, that I can't imagine that anything > else is really needed. > I remember the fighting going on at that time. I did some Pascal in about 1986, with one of the Waterloo compilers, and found it mildly a pain in the neck; it was a reasonably-nearly-strict version of the academic language, and was painful for non-academic programming for the reasons normally thrown about. In grad school, I TA'ed a course that was using TurboPascal, and it was definitely a reasonable extension towards usability for larger programs that needed more sophisticated environmental interactions. The compiler was decently fast (unlike Ada, anyone??? ;-) ), and the makers were selective and adequately opinionated as to their extensions. And I fully recall the split ongoing, as academic folk would regard TurboPascal as "non-conformant" with the standard, whilst bwk's missive on "Why Pascal Is Not My Favorite Language" provides a good explanation... And bwk nicely observed, "Because the language is so impotent, it must be extended. But each group extends Pascal in its own direction, to make it look like whatever language they really want." The Modula family seemed like the better direction; those were still Pascal-ish, but had nice intentional extensions so that they were not nearly so "impotent." I recall it being quite popular, once upon a time, to write code in Modula-2, and run it through a translator to mechanically transform it into a compatible subset of Ada for those that needed DOD compatibility. The Modula-2 compilers were wildly smaller and faster for getting the code working, you'd only run the M2A part once in a while (probably overnight!) -- When confronted by a difficult problem, solve it by reducing it to the question, "How would the Lone Ranger handle this?" 
From thomas.paulsen at firemail.de Wed May 27 05:29:57 2020 From: thomas.paulsen at firemail.de (Thomas Paulsen) Date: Tue, 26 May 2020 21:29:57 +0200 Subject: [TUHS] History of popularity of C In-Reply-To: References: <8a2e9b1b-8890-a783-5b53-c8480c070f2e@telegraphics.com.au> <9e5933a166ece32b4fb17c6bbb563873@firemail.de> Message-ID: An HTML attachment was scrubbed... URL: From woods at robohack.ca Wed May 27 05:50:46 2020 From: woods at robohack.ca (Greg A. Woods) Date: Tue, 26 May 2020 12:50:46 -0700 Subject: [TUHS] History of popularity of C In-Reply-To: References: <8a2e9b1b-8890-a783-5b53-c8480c070f2e@telegraphics.com.au> Message-ID: At Tue, 26 May 2020 10:32:43 -0400, Clem Cole wrote: Subject: Re: [TUHS] History of popularity of C > > Dave that was exactly my point. Pascal was designed as a teaching > language so Wirth did not put things into the language that made it helpful > as a production language. So everyone else tried and the language became > a mess. Everybody peed on it. Dennis' quote: “When I read commentary > about suggestions for where C should go, I often think back and give thanks > that it wasn't developed under the advice of a worldwide crowd.” > And that's exactly what's wrong with C now -- except it's probably even a bit worse for C as the majority of people who have been sitting on the C standards committees for the past decades are primarily either those with deeply funded agendas about how they think they can make more money with the language if only it behaves a certain way (e.g. more like C++); and/or a few academic compiler and optimizer experts who have strong ideas about how they can eke the tiniest gains from their compilers if only the spec says certain things. UB (undefined behaviour), for example, should be stricken from the standard completely and forever.
Every behaviour MUST be defined, either by the implementation (with NO recourse for or fallback to UB), or, strictly defined, by the standard. -- Greg A. Woods Kelowna, BC +1 250 762-7675 RoboHack Planix, Inc. Avoncote Farms -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 195 bytes Desc: OpenPGP Digital Signature URL: From crossd at gmail.com Wed May 27 05:55:39 2020 From: crossd at gmail.com (Dan Cross) Date: Tue, 26 May 2020 15:55:39 -0400 Subject: [TUHS] History of popularity of C In-Reply-To: References: <8a2e9b1b-8890-a783-5b53-c8480c070f2e@telegraphics.com.au> <9e5933a166ece32b4fb17c6bbb563873@firemail.de> Message-ID: Cc: to COFF, as this isn't so Unix-y anymore. On Tue, May 26, 2020 at 12:22 PM Christopher Browne wrote: > [snip] > The Modula family seemed like the better direction; those were still > Pascal-ish, but had nice intentional extensions so that they were not > nearly so "impotent." I recall it being quite popular, once upon a time, > to write code in Modula-2, and run it through a translator to mechanically > transform it into a compatible subset of Ada for those that needed DOD > compatibility. The Modula-2 compilers were wildly smaller and faster for > getting the code working, you'd only run the M2A part once in a while > (probably overnight!) > Wirth's languages (and books!!) are quite nice, and it always surprised and kind of saddened me that Oberon didn't catch on more. Of course Pascal was designed specifically for teaching. I learned it in high school (at the time, it was the language used for the US "AP Computer Science" course), but I was coming from C (with a little FORTRAN sprinkled in) and found it generally annoying; I missed Modula-2, but I thought Oberon was really slick. 
The default interface (which inspired Plan 9's 'acme') had this neat graphical sorting simulation: one could select different algorithms and vertical bars of varying height were sorted into ascending order to form a rough triangle; one could clearly see the inefficiency of e.g. Bubble sort vs Heapsort. I seem to recall there was a way to set up the (ordinarily randomized) initial conditions to trigger worst-case behavior for quicksort. I have a vague memory of showing it off in my high school CS class. - Dan C. From jon at fourwinds.com Wed May 27 06:00:27 2020 From: jon at fourwinds.com (Jon Steinhart) Date: Tue, 26 May 2020 13:00:27 -0700 Subject: [TUHS] History of popularity of C In-Reply-To: References: <8a2e9b1b-8890-a783-5b53-c8480c070f2e@telegraphics.com.au> <9e5933a166ece32b4fb17c6bbb563873@firemail.de> Message-ID: <202005262000.04QK0Rh52097995@darkstar.fourwinds.com> Dan Cross writes: > > Of course Pascal was designed specifically for teaching. I learned it in > high school ... I had a different experience; I learned C in high school at BTL and then took my first programming class in college which was Pascal and I kept finding it extremely difficult to use because it was so much less flexible than C. Until I took that class it had never even occurred to me that people would write books about the topic as I had learned from technical memoranda. There were two books in this class, Wirth's and Fundamental Algorithms. Got Don to sign my copy a few years ago which he said he wouldn't do unless it looked really used.
Jon From thomas.paulsen at firemail.de Wed May 27 07:48:43 2020 From: thomas.paulsen at firemail.de (Thomas Paulsen) Date: Tue, 26 May 2020 23:48:43 +0200 Subject: [TUHS] History of popularity of C In-Reply-To: References: <8a2e9b1b-8890-a783-5b53-c8480c070f2e@telegraphics.com.au> Message-ID: >And that's exactly what's wrong with C now -- except it's probably even >a bit worse for C as the majority of people who have been sitting on the >C standards committees for the past decades are primarily either those >with deeply funded agendas about how they think they can make more money >with the language if only it behaves a certain way (e.g. more like C++); they don't play any role, as the C language was defined decades ago. I learned it before the ansi committee came to an end by Turbo C and soon later MS C, and then various *NIX compilers. Recently I written a couple of linux programs using gcc with exactly the same syntax I studied 30 years ago, and it works pretty cool. All these programs are error free performing very fast while having a small memory footprint. For me there is nothing better than C, and I know a lot of languages. From woods at robohack.ca Wed May 27 08:36:31 2020 From: woods at robohack.ca (Greg A. Woods) Date: Tue, 26 May 2020 15:36:31 -0700 Subject: [TUHS] History of popularity of C In-Reply-To: References: <8a2e9b1b-8890-a783-5b53-c8480c070f2e@telegraphics.com.au> Message-ID: At Tue, 26 May 2020 23:48:43 +0200, "Thomas Paulsen" wrote: Subject: Re: [TUHS] History of popularity of C > > they don't play any role, as the C language was defined decades ago. I learned it > before the ansi committee came to an end by Turbo C and soon later MS C, and then > various *NIX compilers. Recently I written a couple of linux programs using gcc > with exactly the same syntax I studied 30 years ago, and it works pretty cool. All > these programs are error free performing very fast while having a small memory > footprint. 
For me there is nothing better than C, and I know a lot of languages. You might be surprised by just how much C has been changed since, say, C89, or even C90, and how niggly the corner cases can get (i.e. where UB sticks its ugly head). Lots of legacy code is now completely broken, at least with the very latest compilers (especially LLVM, but also GCC). Some far more recently written code has even had important security problems, e.g. one in the Linux kernel. NetBSD has to turn off specific "features" in the newest compilers when building the kernel lest they create a broken and/or insecure system. Some code no longer does what it seems to do unless you're the most careful language lawyer at reading it, Standard in hand, and with years of experience. Some compilers can help, e.g. by inserting illegal instructions anywhere where UB would have otherwise allowed the optimizer to go wild and possibly change things completely, but without such tools, and others such as Valgrind, one can get into a heap-o-trouble with the slightest misstep; and of course these tools only work for user-land code, not bare-metal code such as embedded systems and kernels. -- Greg A. Woods Kelowna, BC +1 250 762-7675 RoboHack Planix, Inc. Avoncote Farms From ron at ronnatalie.com Thu May 28 00:37:53 2020 From: ron at ronnatalie.com (Ronald Natalie) Date: Wed, 27 May 2020 10:37:53 -0400 Subject: [TUHS] History of popularity of C In-Reply-To: References: <8a2e9b1b-8890-a783-5b53-c8480c070f2e@telegraphics.com.au> Message-ID: The large areas of undefined and unspecified behavior have always been an issue in C. It was somewhat acceptable when you were using it as a direct replacement for assembler, but Java and many other follow-ons endeavored to be more portable/rigorous.
Of course, you can write crap code in any language. It didn’t take modern C to do this. On the PDP-11 (at least not in split I/D mode), location zero for example contained a few assembler instructions (p&P6) which you could print out. Split I/D and VAX implementations made this even worse by putting a 0 at location 0. When we moved from the VAX to other processors we had location zero unmapped. For the first time, accessing a null pointer ended up trapping rather than either resulting in a null (or some random data). Eventually, we added a feature to the kernel called “Braindamaged Vax compatibility Mode” that restored the zero to location zero. This was enabled by a field we could poke into the a.out header because this was needed on things we didn’t have source code to (things we did we just fixed). Similar nonsense we found where the order in which function args are evaluated was relied upon. The PDP-11, etc… evaluated them right-to-left because that’s how they had to push them on the stack for the call linkage. We had one machine that did that in the opposite order (I considered flipping the compiler behavior anyhow) and when we got to the RISC architectures, things were passed in registers so the evaluation was less predictable. I already detailed the unportability problem I found where the BSD kernel “converted by union”. The most amusing thing I’d have to say was that one day I got a knock on my office door. One of the sales guys from our sister company wanted to know if I could write some Novell drivers for an encrypting ethernet card they were selling. The documentation for writing the driver was quite detailed but all describing i386 assembler interfaces (and the examples were in assembler). About a week into the project I came to the realization that the linkages were all the C subroutine calls for that platform. The caller was C and there was no particular reason why the driver wasn’t also written in C.
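The argument-evaluation problem described above can be made concrete with a small sketch. C leaves the order in which function arguments are evaluated unspecified, so code that happened to work with a right-to-left compiler can quietly change meaning on another one. All names below (`next_value`, `record_pair`, `fill_unspecified`, `fill_sequenced`) are hypothetical, invented only for this illustration.

```c
/* Hypothetical helper: hands out 1, 2, 3, ... by bumping *state. */
static int next_value(int *state)
{
    return ++*state;
}

/* Stores the two arguments in the order they appear in the call. */
static void record_pair(int first, int second, int out[2])
{
    out[0] = first;
    out[1] = second;
}

/* Non-portable: the two next_value() calls may run in either order,
 * so out[] may end up as {1, 2} or {2, 1} depending on the compiler. */
void fill_unspecified(int out[2])
{
    int state = 0;

    record_pair(next_value(&state), next_value(&state), out);
}

/* Portable: separate statements fix the order, so out[] is {1, 2}. */
void fill_sequenced(int out[2])
{
    int state = 0;
    int first = next_value(&state);
    int second = next_value(&state);

    record_pair(first, second, out);
}
```

Only `fill_sequenced` gives the same answer everywhere; `fill_unspecified` is the kind of code that would have behaved differently on the machine that evaluated arguments in the opposite order.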
From clemc at ccc.com Thu May 28 01:09:43 2020 From: clemc at ccc.com (Clem Cole) Date: Wed, 27 May 2020 11:09:43 -0400 Subject: [TUHS] History of popularity of C In-Reply-To: References: <8a2e9b1b-8890-a783-5b53-c8480c070f2e@telegraphics.com.au> Message-ID: Henry Spencer's 10 Commandments for C Programmers On Wed, May 27, 2020 at 10:38 AM Ronald Natalie wrote: > The large areas of undefined and unspecified behavior has always been an > issue in C. It was somewhat acceptable when you were using it as a direct > replacement for assembler, > but Java and many of other follow-ons endevaored to be more > portable/rigourous. Of course, you can write crap code in any language. > > It didn’t take modern C to do this. On the PDP-11 (at least not in split > I/D mode), location zero for example contained a few assembler instructions > (p&P6) which you could print out. > Split I/D and VAX implementations made this even worse by putting a 0 at > location 0. When we moved from the VAX to other processors we had > location zero unmapped. For the > first time, accessing a null pointer ended up trapping rather than either > resulting in a null (or some random data). Eventually, we added a > feature to the kernel called “Braindamanged > Vax compatibility Mode” that restored the zero to location zero. This > was enabled by a field we could poke into the a.out header because this was > needed on things we didn’t have > source code to (things we did we just fixed). > > Similar nonsense we found where the order that function args are evaluated > was relied upon. The PDP-11, etc… evaluated them right-to-left because > that’s how they had to push them > on the stack for the call linkage. We had one machine that did that in > the opposite order (I considered flipping the compiler behavior anyhow0 and > when we got to the RISC architectures, > things were passed in registered so the evaluation was less predictable. 
> > I already detailed the unportability problem I found where the BSD kernel > “converted by union”. > > The most amusing thing I’d have to say was that one day I got a knock on > my office door. One of the sales guys from our sister company wanted to > know if I could write some Novell > drivers for an encrypting ethernet card they were selling. The > documentation for writing the driver was quite detailed but all describing > i386 assembler interfaces (and the examples > were in assembler). About a week into the project I came to realization > that the linkages were all the C subroutine calls for that platform. The > caller was C and there was no particular > reason why the driver wasn’t also written in C. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomas.paulsen at firemail.de Thu May 28 02:11:33 2020 From: thomas.paulsen at firemail.de (Thomas Paulsen) Date: Wed, 27 May 2020 18:11:33 +0200 Subject: [TUHS] History of popularity of C In-Reply-To: References: <8a2e9b1b-8890-a783-5b53-c8480c070f2e@telegraphics.com.au> Message-ID: <95e6e8de901c837a28b84e62556ba326@firemail.de> >The large areas of undefined and unspecified behavior has always been an >issue in C. It was somewhat acceptable when you were using it as a direct >replacement for assembler, but Java and many of other follow-ons endevaored >to be more portable/rigourous. One cannot compare system and business related stuff! When I'm doing C I always have the CPU and its instructions in mind. As Linus I see the assembly code in my inner eyes. For such minds, doing with C what earlier was done with assembly, C was created, whereas writing business applications cobol and its modern relative java are the first choices. From woods at robohack.ca Thu May 28 05:49:25 2020 From: woods at robohack.ca (Greg A. 
Woods) Date: Wed, 27 May 2020 12:49:25 -0700 Subject: [TUHS] History of popularity of C In-Reply-To: <95e6e8de901c837a28b84e62556ba326@firemail.de> References: <8a2e9b1b-8890-a783-5b53-c8480c070f2e@telegraphics.com.au> <95e6e8de901c837a28b84e62556ba326@firemail.de> Message-ID: At Wed, 27 May 2020 18:11:33 +0200, "Thomas Paulsen" wrote: Subject: Re: [TUHS] History of popularity of C > > When I'm doing C I always have the CPU and its instructions in mind. And that's exactly what might trip you up unless you _exactly_ understand how the language standard defines the operations of the abstract virtual machine (right down to the implications of every sequence point in the code); how compilers and optimizers do and (more importantly) do not work when mapping the abstract virtual machine operations into real-world machine instructions; and how _all_ instances of "undefined behaviour" can arise, and exactly what the optimizer is allowed to do when and if it spots UB conditions in the code. A big part of the problem is that the C Standard mandates compilation will and must succeed (and allows this success to be totally silent too) even if the code contains instances of undefined behaviour. This means that the successful execution of the generated code may depend on what optimization level was chosen. Code that does security tests on input values might be entirely and silently eliminated by the optimizer because of some innocuous-seeming UB instance, and this is exactly what has happened in the Linux kernel, for example (probably more than once). UB can be introduced quite innocently just by moving sequence points in variable references in ways that are not necessarily obvious even to seasoned programmers (and indeed "seasoned" programmers are often the ones whose old-fashioned coding habits might lead to introduction of serious problems in such a way).
I've found dozens of instances of UB in mature and well tested code, and sometimes only by luck of having chosen the "right" compiler and enabled its feature of introducing illegal instructions in places where UB might occur, _and_ having had the luck to test in such a way as to encounter the specific code path where this UB occurred. I would claim it's truly safer now to write C without understanding the underlying mechanics of the CPU and memory, but rather by just paying very close attention to the detailed semantics of the language, understanding only the abstract virtual C machine, and hoping your compiler will at least warn if anything even remotely suspicious is done in your code; and lastly (but perhaps most importantly) avoiding like the plague any coding constructs which might make UB harder to spot (e.g. never ever initialize local variables with their definition when pointers are involved). Unfortunately the new "most advanced" C compilers also make it quite a bit more difficult for those of us writing C code that must have specific actions on the bare metal hardware, e.g. in embedded systems, kernels, hardware drivers, etc.; including especially where UB detection tools are far more difficult to use. -- Greg A. Woods Kelowna, BC +1 250 762-7675 RoboHack Planix, Inc. Avoncote Farms -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 195 bytes Desc: OpenPGP Digital Signature URL: From lm at mcvoy.com Thu May 28 06:13:47 2020 From: lm at mcvoy.com (Larry McVoy) Date: Wed, 27 May 2020 13:13:47 -0700 Subject: [TUHS] History of popularity of C In-Reply-To: References: <95e6e8de901c837a28b84e62556ba326@firemail.de> Message-ID: <20200527201347.GY22882@mcvoy.com> So I may have just gotten lucky in my 30+ years of writing C code but I have yet to hit a single instance of this doom and gloom. On Wed, May 27, 2020 at 12:49:25PM -0700, Greg A. 
Woods wrote: > At Wed, 27 May 2020 18:11:33 +0200, "Thomas Paulsen" wrote: > Subject: Re: [TUHS] History of popularity of C > > > > When I'm doing C I always have the CPU and its instructions in mind. > > And that's exactly what might trip you up unless you _exactly_ > understand how the language standard defines the operations of the > abstract virtual machine (right down to the implications of every > sequence point in the code); how compilers and optimizers do and (more > importantly) do not work when mapping the abstract virtual machine > operations into real-world machine instructions; and how _all_ > instances of "undefined behaviour" can arise, and exactly what the > optimizer is allowed to do when and if it spots UB conditions in the > code. > > A big part of the problem is that the C Standard mandates compilation > will and must succeed (and allows this success to be totally silent too) > even if the code contains instances of undefined behaviour. This means > that the successful execution of the generated code may depend on what > optimization level was chosen. Code that does security tests on input > values might be entirely and silently eliminated by the optimizer > because of some innocuous-seeming UB instance, and this is exactly what > has happened in the Linux kernel, for example (probably more than once). > > UB can be introduced quite innocently just by moving sequence points in > variable references in ways that are not necessarily obvious even to > seasoned programmers (and indeed "seasoned" programmers are often the > ones whose old-fashioned coding habits might lead to introduction of > serious problems in such a way).
> > I've found dozens of instances of UB in mature and well tested code, and > sometimes only by luck of having chosen the "right" compiler and enabled > its feature of introducing illegal instructions in places where UB might > occur, _and_ having had the luck to test in such a way as to encounter > the specific code path where this UB occurred. > > I would claim it's truly safer now to write C without understanding the > underlying mechanics of the CPU and memory, but rather by just paying > very close attention to the detailed semantics of the language, > understanding only the abstract virtual C machine, and hoping your > compiler will at least warn if anything even remotely suspicious is done > in your code; and lastly (but perhaps most importantly) avoiding like > the plague any coding constructs which might make UB harder to spot > (e.g. never ever initialize local variables with their definition when > pointers are involved). > > Unfortunately the new "most advanced" C compilers also make it quite a > bit more difficult for those of us writing C code that must have > specific actions on the bare metal hardware, e.g. in embedded systems, > kernels, hardware drivers, etc.; including especially where UB detection > tools are far more difficult to use. > > -- > Greg A. Woods > > Kelowna, BC +1 250 762-7675 RoboHack > Planix, Inc. Avoncote Farms -- --- Larry McVoy lm at mcvoy.com http://www.mcvoy.com/lm From rich.salz at gmail.com Thu May 28 06:23:04 2020 From: rich.salz at gmail.com (Richard Salz) Date: Wed, 27 May 2020 16:23:04 -0400 Subject: [TUHS] History of popularity of C In-Reply-To: <20200527201347.GY22882@mcvoy.com> References: <95e6e8de901c837a28b84e62556ba326@firemail.de> <20200527201347.GY22882@mcvoy.com> Message-ID: Places where I've seen it: crypto code in OpenSSL. When trying to zeroize key material, the compiler sees that "char key[]" isn't used any more and optimizes away the memset. Another is trying to do constant-time math.
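For the zeroization case above, one commonly used workaround is to call memset through a volatile function pointer, so that the compiler cannot assume what the call does and is much less likely to discard it as a dead store. The sketch below is illustrative only: secure_zero, memset_ptr and derive_and_wipe are made-up names, this is not OpenSSL's actual code, and the C standard does not strictly guarantee the effect. Where available, memset_s (C11 Annex K) or explicit_bzero (glibc, the BSDs) is probably the better choice.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not taken from OpenSSL). The pointer object is
 * volatile, so it has to be re-read on every call and the compiler
 * cannot prove that the callee is still the standard memset.
 */
static void *(*const volatile memset_ptr)(void *, int, size_t) = memset;

void secure_zero(void *buf, size_t len)
{
    memset_ptr(buf, 0, len);
}

/* Typical use: wipe key material before the buffer goes out of scope. */
int derive_and_wipe(void)
{
    unsigned char key[32];
    int checksum = 0;
    size_t i;

    for (i = 0; i < sizeof key; i++) {
        key[i] = (unsigned char)(i + 1);  /* stand-in for real key material */
        checksum += key[i];
    }

    /*
     * A plain memset(key, 0, sizeof key) at this point is a dead store
     * that the optimizer is allowed to delete, since key is never read
     * again.
     */
    secure_zero(key, sizeof key);
    return checksum;
}
```

Whether the wipe really survives still has to be checked in the generated code for the compiler and optimization level in use.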
-------------- next part -------------- An HTML attachment was scrubbed... URL: From nliber at gmail.com Thu May 28 07:00:57 2020 From: nliber at gmail.com (Nevin Liber) Date: Wed, 27 May 2020 16:00:57 -0500 Subject: [TUHS] History of popularity of C In-Reply-To: References: <8a2e9b1b-8890-a783-5b53-c8480c070f2e@telegraphics.com.au> <95e6e8de901c837a28b84e62556ba326@firemail.de> Message-ID: On Wed, May 27, 2020 at 2:50 PM Greg A. Woods wrote: > A big part of the problem is that the C Standard mandates compilation > will and must succeed (and allows this success to be totally silent too) > even if the code contains instances of undefined behaviour. No it does not. To quote C11:

undefined behavior: behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements

NOTE Possible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).

Much UB cannot be detected at compile time. Much UB is too expensive to detect at run time. Take strlen(const char* s) for example. s must be a valid pointer that points to a '\0'-terminated string. How would you detect that at compile time? How would you set up your run time to detect that and error out? How would you design your codegen and runtime to detect and error out when UB is invoked in this code:

#include <stdio.h>
#include <string.h>

void A(const char* a, const char* b)
{
    printf("%zu %zu\n", strlen(a), strlen(b));
}

// Separate compilation unit
void A(const char* a, const char* b);  // declaration of A from the other unit

int main()
{
    const char a[] = {'A'};   // not '\0'-terminated, so strlen(a) is UB
    const char b[] = {'\0'};
    A(a, b);
}

-- Nevin ":-)" Liber <nliber at gmail.com> +1-847-691-1404 -------------- next part -------------- An HTML attachment was scrubbed...
URL: From clemc at ccc.com Thu May 28 07:42:06 2020 From: clemc at ccc.com (Clem Cole) Date: Wed, 27 May 2020 17:42:06 -0400 Subject: [TUHS] Sad Times - hopefully this is just temporary Message-ID: https://www.livingcomputers.org/Closure.aspx -------------- next part -------------- An HTML attachment was scrubbed... URL: From woods at robohack.ca Thu May 28 09:17:06 2020 From: woods at robohack.ca (Greg A. Woods) Date: Wed, 27 May 2020 16:17:06 -0700 Subject: [TUHS] History of popularity of C In-Reply-To: References: <8a2e9b1b-8890-a783-5b53-c8480c070f2e@telegraphics.com.au> <95e6e8de901c837a28b84e62556ba326@firemail.de> Message-ID: At Wed, 27 May 2020 16:00:57 -0500, Nevin Liber wrote: Subject: Re: [TUHS] History of popularity of C > > On Wed, May 27, 2020 at 2:50 PM Greg A. Woods wrote: > > > > A big part of the problem is that the C Standard mandates compilation > > will and must succeed (and allows this success to be totally silent too) > > even if the code contains instances of undefined behaviour. > > No it does not. > > To quote C11: > > undefined behavior > behavior, upon use of a nonportable or erroneous program construct or of > erroneous data, for which this International Standard imposes no > requirements Sorry, I concede. Yes, "no requirements". In C99 at least. Sadly most compilers, including GCC and Clang/LLVM, will, at best, warn (and warnings are only treated as errors by the most macho|wise); and compilers only do that now because they've been getting flak from developers whenever the optimizer does something unexpected. > Much UB cannot be detected at compile time. Much UB is too expensive to > detect at run time. Indeed. At best you can get a warning, or optional runtime code to abort the program. Now this isn't a problem when "undefined behaviour" becomes "implementation defined behaviour" for a given implementation.
However, that's obviously not portable, except for the trivial cases where the common compilers for a given type of platform all do the same things. The real problems though arise when the optimizer takes advantage of these rules regardless of what the un-optimized code will do on any given platform and architecture. The Linux kernel example I've referred to involved dereferencing a pointer to do an assignment in a local variable definition, then a few lines later testing if the pointer was NULL before using the local variable. Unoptimised, the code will dereference a NULL pointer and load junk from location zero into the variable (because it's kernel code), then the NULL test will trigger and all will be good. The optimizer rips out the NULL check because "obviously" the programmer has assumed the pointer is always a valid non-NULL pointer since they've explicitly dereferenced it before checking it and they wouldn't want to waste even a single jump-on-zero instruction checking it again. (It's also quite possible the code was written "correctly" at first, then someone mushed all the variable initialisations up onto their definitions.) In any case there's now a GCC option: -fno-delete-null-pointer-checks (to go along with -fno-strict-aliasing and -fno-strict-overflow, and -fno-strict-enums, all of which MUST be used, and sometimes -fno-strict-volatile-bitfields too, on all legacy code that you don't want to break) It's even worse when you have to write bare-metal code that must explicitly dereference a NULL pointer (a not-so-real example: you want to use location zero in the CPU zero-page (e.g. on a 6502 or 6800, or PDP-8, etc.) as a pointer) -- it is now impossible to do that in strict Standard C even though trivially it "should just work" despite the silly rules. As far as I can tell it always did just work in "plain old" C. The crazy thing about modern optimizers is that they're way more persistent and often somewhat more clever than your average programmer.
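The kernel pattern described above (dereference in the initializer, NULL test a few lines later) can be sketched as follows. This is an illustration with made-up names (struct device, get_flags_buggy, get_flags_safe), not the actual kernel source. In the first function the optimizer is entitled to assume dev is non-NULL once it has been dereferenced, and may delete the test; the second orders the operations so that the test has to stay.

```c
#include <stddef.h>

struct device {
    int flags;
};

/*
 * Problematic pattern: the initializer dereferences dev before the
 * NULL test. Calling this with NULL is undefined behaviour, so the
 * compiler may treat the test below as always false and remove it.
 */
int get_flags_buggy(struct device *dev)
{
    int flags = dev->flags;

    if (dev == NULL)
        return -1;
    return flags;
}

/*
 * Safer ordering: test first, dereference afterwards. No undefined
 * behaviour occurs for a NULL argument, so the check cannot be dropped.
 */
int get_flags_safe(struct device *dev)
{
    int flags;

    if (dev == NULL)
        return -1;
    flags = dev->flags;
    return flags;
}
```

With a valid pointer both functions return the same value; only get_flags_safe may be called with NULL.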
They follow all the paths. They apply all the rules at every turn. > Take strlen(const char* s) for example. s must be a valid pointer that > points to a '\0'-terminated string. How would you detect that at compile > time? How would you set up your run time to detect that and error out? My premise is that you shouldn't try to detect this problem, AND in any case where the optimizer might be able to prove the pointed at object isn't a valid string it should not, and must not, abuse that knowledge to rip out code or cause other even worse mis-behaviour. I.e. this should not be "undefined", but rather "implementation defined and without any recourse to allowing optimizer abuses". -- Greg A. Woods Kelowna, BC +1 250 762-7675 RoboHack Planix, Inc. Avoncote Farms -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 195 bytes Desc: OpenPGP Digital Signature URL: From meillo at marmaro.de Thu May 28 22:34:37 2020 From: meillo at marmaro.de (markus schnalke) Date: Thu, 28 May 2020 14:34:37 +0200 Subject: [TUHS] fmt(1): history, POSIX, -t, -c Message-ID: <1jeHk5-5LM-00@marmaro.de> Hoi, personally I use fmt(1) a lot for email formatting and such. Typically I only use the `-w' parameter. Now someone asked me about `-t' and `-c' of *GNU* fmt(1). I wasn't able to find good documentation on them. The manpage only tells that they have to do with different indentation for the first or first two lines. But what are the use cases? How would source text for these parameters look like? A look into the description and rationale sections of POSIX, which often provides helpful information, was not possible because fmt(1) is not part of POSIX (only fold(1) is). Why's that? Is it because fmt(1) differs so much between Unix implementations? On BSD `-c' centers text and `-t' sets tab widths. Plan 9 has none of these options. But still, `-w' could have been standardized. 
Or was the line filling algorithm different as well? How does fold(1) fit into the picture? Maybe you can answer some of these questions or give hints on where I could find answers myself. meillo From robpike at gmail.com Thu May 28 23:08:39 2020 From: robpike at gmail.com (Rob Pike) Date: Thu, 28 May 2020 23:08:39 +1000 Subject: [TUHS] fmt(1): history, POSIX, -t, -c In-Reply-To: <1jeHk5-5LM-00@marmaro.de> References: <1jeHk5-5LM-00@marmaro.de> Message-ID: I looked in my manuals. Fmt(1) first appears in Research 9th edition. I have vague memories that it was written by Tom Duff, but a) I could misremember and b) I also have vague memories it was not original. If both memories are accurate, it's just a simple command written in two different places, one being a distorted echo of another. Much like the make td wrote at UofT after hearing about Stu's. Nothing nefarious. -rob On Thu, May 28, 2020 at 10:41 PM markus schnalke wrote: > Hoi, > > personally I use fmt(1) a lot for email formatting and such. > Typically I only use the `-w' parameter. Now someone asked me about > `-t' and `-c' of *GNU* fmt(1). I wasn't able to find good documentation > on them. The manpage only tells that they have to do with different > indentation for the first or first two lines. But what are the use > cases? How would source text for these parameters look like? > > A look into the description and rationale sections of POSIX, which > often provides helpful information, was not possible because fmt(1) > is not part of POSIX (only fold(1) is). Why's that? Is it because > fmt(1) differs so much between Unix implementations? On BSD `-c' > centers text and `-t' sets tab widths. Plan 9 has none of these > options. But still, `-w' could have been standardized. Or was the > line filling algorithm different as well? How does fold(1) fit into > the picture? > > Maybe you can answer some of these questions or give hints on where > I could find answers myself. 
> > > meillo > -------------- next part -------------- An HTML attachment was scrubbed... URL: From clemc at ccc.com Thu May 28 23:30:09 2020 From: clemc at ccc.com (Clem Cole) Date: Thu, 28 May 2020 09:30:09 -0400 Subject: [TUHS] fmt(1): history, POSIX, -t, -c In-Reply-To: <1jeHk5-5LM-00@marmaro.de> References: <1jeHk5-5LM-00@marmaro.de> Message-ID: fmt was originally written by Kurt Shoens at UCB when he worked in Mail and delivermail. begin 644 fmt.tar.bz2 M0EIH.3%!62936:*IABH`#=1_U/TP`8!_____________W_]2B`0```(""&`: M7W@`/IIH`&@T!-$9$R:3U M%/4VH>4;4](PT0-&)H`-```````U,DRIZ!#U!M1D>B-#$;2&F(9-```80#0` M:,)H"32B0*;*:GD$G@*'Z*>F3(U#U-J;1#TAZAH#)Z at T!ZFC1Z@`<```T#0T M-#3(`-``!H!H:``!D``"1$$"`$9"&(TDS30IZ,4;4;4Q,&H, at 8F@-&F@&RGY M1?G$_-^6''G7ZU]8;A,V`VT\0%_TFD"/3Y8^MQV#;T(R3^!&;`F`*((#98)( M4KO"=V(3&($G$>RP@@PD*Q6("451L[MJL5D1"**K$6`J,(@Q8B"+$8K$4!@_ M;M`VE:!8)HAN9R\O#WWVMVS1SG`9$5`Q2B`H+[:-E4,%OIM%%1C$0,,* [ctcole-mac09:bsd-sources/usr.bin/fmt] ctcole% wc -l *uu 129 fmt.tar.bz2.uu [ctcole-mac09:bsd-sources/usr.bin/fmt] ctcole% clear [ctcole-mac09:bsd-sources/usr.bin/fmt] ctcole% cat *uu begin 644 fmt.tar.bz2 M0EIH.3%!62936:*IABH`#=1_U/TP`8!_____________W_]2B`0```(""&`: M7W@`/IIH`&@T!-$9$R:3U M%/4VH>4;4](PT0-&)H`-```````U,DRIZ!#U!M1D>B-#$;2&F(9-```80#0` M:,)H"32B0*;*:GD$G@*'Z*>F3(U#U-J;1#TAZAH#)Z at T!ZFC1Z@`<```T#0T M-#3(`-``!H!H:``!D``"1$$"`$9"&(TDS30IZ,4;4;4Q,&H, at 8F@-&F@&RGY M1?G$_-^6''G7ZU]8;A,V`VT\0%_TFD"/3Y8^MQV#;T(R3^!&;`F`*((#98)( M4KO"=V(3&($G$>RP@@PD*Q6("451L[MJL5D1"**K$6`J,(@Q8B"+$8K$4!@_ M;M`VE:!8)HAN9R\O#WWVMVS1SG`9$5`Q2B`H+[:-E4,%OIM%%1C$0,,* M6F%<6T%F^RQQ2L8`Q-::(N!!!6T"BD8CNGT_9]7/./^RH,2SP9SG(VU<+B7# M?263G$UI\F_!TUCIMVWANX6XIF7'BKBXI0U8R9IEQ3"N"AC%5+FRI%/4_LWP MXOW?SWG!=9P3@;Z5LX-=^<#>!7#BPMQIBY-$#&:#%'%JDT,Z+>1T'?DV,\WP7]:WQ$PF`9A(%,A3ZC`X2"%9$ M5IK!-,8M!GW6PV3(L-F(SBX+8/7.OF4V5#[,^4XWG:,RYYR[?IWY_E;ER1>R M#!5&N&I^VQ];,DF+YW8[(FI!$L at MYSH&)9=9VF6UT63U636+:N8T`Y-73*TP MP at C@SDC"9>0+W2CJ)"@\QC3$AQT at 
1%`PH,FZHPZ06];I24(Q/-Z^IYVW0DGN<_M*JJHZ at G=8<_O@[*%XSQ(=O9>A=OM;&,KQ6IJ M^QOF-Y;'43G4P0>@JHBH)V:"111K!A.H?C-OO>SVGNGHP,5V/(8Q]BTH$TB! MTTE2T1>%8<,U\7=AV52;$@45%%@H*J,!05$`6"Q$;.2@>PP9GG'%2VG4[.)5 M(9`&AXFL6:G,A4/A&HK2.WSY;)S\WR#V;0HR6N@=(9BHWDFY4C3&(M3-'#:& MO"&JNK=E/7;ZMZB0)!\.7,6M?TP_4I^9K'8A`W,=)FN\='S*#12(H<#B-H9+ M8&H(5LZ(SR[#.VQF!I at 93@^S,5Y%F]DD9Y&,FIA"FZL;":\Y%M130CGDC8VP M\YH,^@[ISG@("NOQ`QEPJX!SN']SEH5]HS)',=L+8I`T+8HXVTL>D]QB;HK( MR[,:#&D-.0[FE,,Q=+87`Z&O7>#9YR%8L*`.I6!<\NIE/F8#MT`ZJAG*:H<<=@NR2A8BF5&\(D:F]XO,:C(V0\P=]QE M,(+U%QI7:A.:,6%R:2':+7#8)G@:6ZCB@`\OL3RGE at 2&!N!YC,"+0`X%\<.- MPW!,T"+O%,"*$FAE@!F$%II+9)WPB!B$AD=Z3#%2>Y91'.5_.Y`8O.V>$][" MD]\8OBYXN(*!2P*4*7X&^\DKIH+3%_=QX,EKIKL.E#`?!["%WE8 at H6J[%['! MTW-VBU"]\$IZ>FY*O=3O[*UIUSKJ at L'!REJ#Q62A$KF$)&125M&%KF0U"[19 MI['6C'&Z\).5`:F9ACPOWC&THKVF#;4K6[;:H43&A#:EFNTQW'>(&_E8YYNP MNM at 870R.706,,N$,-<$`K?G(:C$T(+6SC7%Y)-F%=L`XJ'-8#M,\V3C=BD=^ MR9J-E"M".$-=HU>6WQBVF.%9H[V)H4,!F( M5L.?*)(9\.%9H@#RJ+(B*HA>+#%VM%%13G*JQ*:N`9Y'PR].Z3FAB^PW*[B`IZPS`K8(Y2*D,%'=).4#? 
M-JB*J)"NYO5QV=T$].]J-I40#:J@*+^R2WR5>Z$68QH59(2S:,XAL73W;RO%Q5XILHZ`7I3$HB&>9Y,97.U,'B,/>E> MJ;A=4[Y+AX=)H#R.B31&7U6QOEOI3G_B7%\.PN#"G?^*XLLW&>JM/PR(^JKH M.O:Q#U39R^3(^/8O5SYLKH3QD6EO"UB7+- at QZF.0<9I0)XRZG37C$LN26)N: M<"#HFV-:-)#IXBUK2_*W6G,0_)("@HC&2"(`@K%5$%BR+)%D$83?WBPD+S6X M;&%+*DK\=[XEQ`KG3!@<%AQ!$9.U.*(/(\J+>^>V)<_+.BQ$M_3"S\GXZ^^G M>7[6]4&M;8[=_TA\Y[@=LZ3C%-1K%.H>.BPX-CHV+[744WH^2-@[?9AX& M85SF58;FXM:SPA(TSIEY`YE8 at JJHFU+4*QMB>Q&=,(&*`A.*^^Y at O=#0IINS M7GTM;NL at Q+!NV\DA&3\4DU=&F%DL<:S26PTP3$=X*F""DMGJ3]AJEWA[0X1R M"_;QF('DIL*\>[Z_OA^!B#!@!^!B7'KIBI'[%K/M7E=S5OD0D?,A9V$H/I:@ M:^VUH?3NWM(^]S'T^A(["GBNFXS7V0%_?8O(GEB5J)_92N+XZQ@`M!,!;A?W M"_!;2% MM5$345#C^?%99#S-##.EEH9$+"2155FLE>7LM'5B2-+#B,5RV+O96*S;(1#C MP,;XXKB#"&2L+6!?JO62T at T&I8H)=-U=+F,9%TRI@*TBP*H*L51471:S*53' MG at +%I&BR*D0(Q6K&#&*U'@6\Z'HO22TY^/FYM6]I.>I=KD9XE]^E1IJSKV"J MC\22.G:<"=4*1D&[G66]Y*AQV:+WU<.-AP&6#!C;'@&%6S*VXH^])-K80P17 MJI'-66ZP54'.KQ2,Q,[UE&DGU-2-;MH.E9,C6VS(0P&H*=N\95Q4D=K,B)TR M,3N'Q4*1 at E"M*16WT?'88!]#I$6:,A*<,"&BC%]A>HXBZ+4IJ(*ZR$8L#]/3 MN]/]$:T0@/.'8)E&GJ#K'`\SC^L]`=#0'$8,`B$^8$0]!@]1L,?*:MYK/-@# MR1FB>@XQ)\QB:@:5`8?A3NRNTJ0;#U(/["I`Q6`U36Y8,R#'@6N9EFF)U#/TEC at 7`@*'N'40$G6,P#>&A-0W#/R' M[3(V)7),CB5#>&LS+$%S44)+&H-AEK#66,PZ#1R6`T!'6R!HV)0$+\>($]9< MZ@@98+V2C at MRV#_X&&\D@`UD(("S9\2N*2A8W!R&4(*%0D+D'ZCH!Q0;2 at 5! 
M=P,,PZA08K244H8`R36:BQ7F&H,3,#)-(89`&A*8:S>=AM.1S_Y,1G_:@5@[ M2`L`TC(Q at C(5$$$G['VI#080,>;(X8#!0H;J_S3N4/E2((Q1AYW.'`2@,DZB M!%$I/T80D8%`N,L,@)%`4D#U0?P0B.GI,PY'F,BQDQ)J2X/BD:5&=0&!EYNH M+`&B69S.X9<%4-#`V(G-D:SS"L4/(8>3-^[>2'U?5]M$)QR3D@ M4D0F\ZR9"B7$6!8YEPJ*1E!]+!@=`96YK!1A(-!@T)T"BH0)4^"8ZF%/CUTH0-OR/G@>G;I@%F?)$_&'EWN48??T.EX5]WX15513WS0]AF;?S:%X#D&[V!Y^XZM.#51+BTB1>N"F"`P%8$KFQ_Y M>0OUD*BJ*B284\"]X=@4_%IUL>8J4+5\(B;E9=?D4.YNQV&VSD=O/H0W!RL& M=UT at V'[;G96Q=8&JU.VGA+SN=[9O4(6PXFSHS'KEQ[:=TD#>[S2"R2:.#$:D M.C_)&L/!R]MQ)^5O05$748CK",-8>-VDMC7I>1O")*7LUY9+UM)W5R[&-]![ M"D*<.(R==.)`RIO)46"#W9(IFJR/!>*,C;($P$,-@F8X<"< M1/)B4S-@=A.8>W`.YK_8$&(2=AB0N]-5Q'DQEU=J;P15Z5N*@QAE5**A9=:["Y8P M"+%GDE,CW(+&V_NJ7 at FACBHE[\>#/4;%!8#\VV\HH8]ICE%:.LN`EJ#XJ$T9 M:X0%J#@;+XP@^":O'(NM9<;(L^O+(FHTZM(Q at G'45`V1L-FA=E at +RA3(TF$D MW-*YQ5"D(Z8L]'UKE>5[G9S%5(E.;K4L*;0/?!(4A.TDAWY9A*-HFN)I\3#^ M1:@"%>!%$Y1UKB*YQ`$Q'^9#9KP1TJ!K]X)HT%=%CMV'?37AP.D[S80/A(8T,V^L/$@@JDE7B+ MO-.(T:!@UJ6Z:'(U=K!;7@,4'(<:H(P="V=][A46_!%PH?.B*!(]_HY5U`:A MA@)ICH-H5,0RYA["2&TA7&`.C.4A0.>5K^OYL"!06)L)1".Y[.UZMM'O!E#> M-U1T1Z4`=/3*AH7:,I\Y:9:1)$#(@8-6K@/`DNQ8TR at R,*KH(]\T#!7V,K(0 M#`:N(9^C*%:#Z?K^F(B?OYETB4FE=F3&P*(H0CV&8J^"7``7MI;F!_00+2A65)2&2SE5 M-R6I?>ZF46XZ]%H;E*#U.KY.46%TE#;(W%,2L\PW:0C6TV#:;25EO M?#`Q8B;<::Q&X:74L4%1&9M,-J6J4[ MI:`?&65NF.`=>#&AI/JEPLX\FK-51-E-DL'L%7*O$U_, at A0P(38W?!DX[DMB)!;6 M at .U#`U-?.T#82Y3%@N*@7-ADGBLOE+DMN%QX$%"$#6\O%SDJ")"7:]19`HK8 M?9>GPHY&8IH]%[=2ZLJ>JAW#X,>:@&F'&2&T>HS2T#$5#(H0.0H$@\"&'CK) M%>"P.*F]'J&A55LAA=3C?`>3)UJ%4RF'VLPU:9B)@Q0LPU1%??B@@);0X$,N MJ5D>`NX_>H&!@@9:"1;)D,H(PB,#>9#$PZ2)B0J2$Z9M+$-%2$"*#1"95)+- M%#N*4>H_(E=*4P-Z at J)K&B3`W`S[1RY/T4+I#U8ML"6P*P]9A18B-W68,BMH,G#71K[#5_;>V4(Z`CQ.Q7/2>C4RB6AM$ M(CPY]C2UBH>PS/7]#U)&9\1XH.6**C'KXM,0T#ZO9!U-)I52'R)0I%888=31 MI8SXZ*X:DZ)788QY2'N#H'A&LUT):@RVB1DTN99L`G9H=3(Y%MR[#D:PH:PA M'O9\-`KFFUH:!\>^IV/W84DE**!=;2-$/B&LWB&3EX=JW;+0;)B&R7`O.Q$$ M1+]'IJX<$T:*N1V8)TI*2*3%&4:."#AH/;'P#-3%?M]73$H>EE at R;9SB*%HL M=B/B at 
D*Y-227U0&K.=C1%?!]6`4]& MO&MJTS$YU5=I)B;0VE"+C!A1OS$B6XR2M88Q,.734!A5!%-K0'1*7-::VP#7 MT-WOXA**)"HA^'!C(T,W-QV[^1D%(Q#3?N1B=<7BY@%"![S"45VEPSY8(6F1 MX6,0H46C/-"7MZMMVQ6:3:5S/2#7?-Q!P)"250;G6$!$!.FS(5!8XK[E`Q26 M%"3.@$"D9&"/+W#(EKFI(B7#(B4X!EZ3!@GWZBNW0;*5NJ8D9&'1P/5<.LH$ ML&9;J-,R8R&89+ at U'<&=L7C1R9VH72&SN,,8-=;9L\4P9AP%\9.MF][#2'KC M#!+.C<;CCG`NKE'9A at .3-";A9H*">T5`--VL:+8G+&DID1=L#TR0*DT at +8VS M3/`1FP;!M-)B)`#6TA&M at +4%2$..TD1LH+8UI8,X8S,'895ILB at XH4%(38U9 MXI*[38.8SX+SY9YAL&`&Y&":#(&@BW=2`N.$^*XS*VL;&GNMC0S1Q%`B/X4@ MUW>#,B7!%M!MB#U*$\-IJ*$P;]'0AUL$,,[(G^'0/'0."=L.\TFY6?48>MS_!^.^71H<4@@8R4$S` M0'\S`\TP,T#FO/U)04+.O;=4"E-.?7C@!?(T%=".M(?U?_?6R52D0%$%35K\ 2KZT3?11`_\7 wrote: > Hoi, > > personally I use fmt(1) a lot for email formatting and such. > Typically I only use the `-w' parameter. Now someone asked me about > `-t' and `-c' of *GNU* fmt(1). I wasn't able to find good documentation > on them. The manpage only tells that they have to do with different > indentation for the first or first two lines. But what are the use > cases? How would source text for these parameters look like? > > A look into the description and rationale sections of POSIX, which > often provides helpful information, was not possible because fmt(1) > is not part of POSIX (only fold(1) is). Why's that? Is it because > fmt(1) differs so much between Unix implementations? On BSD `-c' > centers text and `-t' sets tab widths. Plan 9 has none of these > options. But still, `-w' could have been standardized. Or was the > line filling algorithm different as well? How does fold(1) fit into > the picture? > > Maybe you can answer some of these questions or give hints on where > I could find answers myself. > > > meillo > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From clemc at ccc.com Thu May 28 23:46:34 2020 From: clemc at ccc.com (Clem Cole) Date: Thu, 28 May 2020 09:46:34 -0400 Subject: [TUHS] fmt(1): history, POSIX, -t, -c In-Reply-To: <1jeHk5-5LM-00@marmaro.de> References: <1jeHk5-5LM-00@marmaro.de> Message-ID: On Thu, May 28, 2020 at 8:41 AM markus schnalke wrote: > A look into the description and rationale sections of POSIX, which > often provides helpful information, was not possible because fmt(1) > is not part of POSIX (only fold(1) is). Why's that? > It was not in SVID and nobody from the BSD side of the war at the time felt it was worth arguing to add it to the standard. Basically, during the writing of both POSIX.1 and .2, there was huge pressure from AT&T to just take the SVID and try to make that the standard. In fact, IIRC, Jim Isaak got AT&T to release the copyright on it and we used some of the original AT&T troff source. But many of us pushed back, saying that even if there was a marketing campaign ("AT&T UNIX®, Consider it Standard") it was hardly so. And many BSD additions (improvements) were taken into the standard. For instance, sockets was the preferred networking interface, although to save face AT&T managed to get the TLI allowed in as an alternative to sockets in the first version of the network specification. (Funny, I don't know of a FIPS-151 registered UNIX implementation that used TLI). Remember, the primary driver for the POSIX work was the ISVs - to make it easier for them to create software that they could sell. Early on, Heinz in particular wanted an ABI, not an API; many of us (myself in that camp) shouted him down. Since those days, I've sometimes wondered whether, if we had figured out earlier how to do that, the UNIX Wars would have worked out differently (but that's a different discussion). Back to fmt(1), like you, I have used it for years, particularly in email.
I usually forked it from vi to paginate my message; that was what I did for years until I finally switched from mh (actually the hm version) to the Gmail interface as my MUA. Clem -------------- next part -------------- An HTML attachment was scrubbed... URL: From clemc at ccc.com Thu May 28 23:47:56 2020 From: clemc at ccc.com (Clem Cole) Date: Thu, 28 May 2020 09:47:56 -0400 Subject: [TUHS] fmt(1): history, POSIX, -t, -c In-Reply-To: References: <1jeHk5-5LM-00@marmaro.de> Message-ID: Ouch sorry for the extra stuff -- cut/paste error which I did not realize until after delivery. On Thu, May 28, 2020 at 9:30 AM Clem Cole wrote: > fmt was originally written by Kurt Shoens at UCB when he worked in Mail > and delivermail. > > On Thu, May 28, 2020 at 8:41 AM markus schnalke wrote: > >> Hoi, >> >> personally I use fmt(1) a lot for email formatting and such. >> Typically I only use the `-w' parameter. Now someone asked me about >> `-t' and `-c' of *GNU* fmt(1). I wasn't able to find good documentation >> on them. The manpage only tells that they have to do with different >> indentation for the first or first two lines. But what are the use >> cases? How would source text for these parameters look like? >> >> A look into the description and rationale sections of POSIX, which >> often provides helpful information, was not possible because fmt(1) >> is not part of POSIX (only fold(1) is). Why's that? Is it because >> fmt(1) differs so much between Unix implementations? On BSD `-c' >> centers text and `-t' sets tab widths. Plan 9 has none of these >> options. But still, `-w' could have been standardized. Or was the >> line filling algorithm different as well? How does fold(1) fit into >> the picture? >> >> Maybe you can answer some of these questions or give hints on where >> I could find answers myself. >> >> >> meillo >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mah at mhorton.net Fri May 29 02:08:22 2020 From: mah at mhorton.net (Mary Ann Horton) Date: Thu, 28 May 2020 09:08:22 -0700 Subject: [TUHS] fmt(1): history, POSIX, -t, -c In-Reply-To: References: <1jeHk5-5LM-00@marmaro.de> Message-ID: It's nice to see a uuencode email attachment once again! Clem (who, by the way, is correct about the origin of fmt, it was part of Kurt Shoens' Mail program at Berkeley) has perfect timing. This coming Monday, 6/1/2020, is the 40th anniversary of the uuencode email attachment.
(The date is based on the date in the uuencode man page in the 2.8BSD and
4.2BSD archives at https://www.tuhs.org/Archive/Distributions/UCB/ )

There has been some amusing coverage of the "25th anniversary of the email
attachment", commemorating Nat Borenstein's creation of MIME in 1992.

Any thoughts on a proper commemoration of the 40th anniversary?

    Mary Ann

On 5/28/20 6:47 AM, Clem Cole wrote:
> Ouch sorry for the extra stuff -- cut/paste error which I did not
> realize until after delivery.
>
> On Thu, May 28, 2020 at 9:30 AM Clem Cole wrote:
>
> fmt was originally written by Kurt Shoens at UCB when he worked in
> Mail and delivermail.
>
> On Thu, May 28, 2020 at 8:41 AM markus schnalke wrote:
>
> Hoi,
>
> personally I use fmt(1) a lot for email formatting and such.
> Typically I only use the `-w' parameter. Now someone asked me about
> `-t' and `-c' of *GNU* fmt(1). I wasn't able to find good
> documentation on them. The manpage only tells that they have to do
> with different indentation for the first or first two lines. But
> what are the use cases? How would source text for these parameters
> look like?
>
> A look into the description and rationale sections of POSIX, which
> often provides helpful information, was not possible because fmt(1)
> is not part of POSIX (only fold(1) is). Why's that? Is it because
> fmt(1) differs so much between Unix implementations? On BSD `-c'
> centers text and `-t' sets tab widths. Plan 9 has none of these
> options. But still, `-w' could have been standardized. Or was the
> line filling algorithm different as well? How does fold(1) fit into
> the picture?
>
> Maybe you can answer some of these questions or give hints on where
> I could find answers myself.
>
>
> meillo

From imp at bsdimp.com Fri May 29 02:40:55 2020
From: imp at bsdimp.com (Warner Losh)
Date: Thu, 28 May 2020 10:40:55 -0600
Subject: [TUHS] Latest 2.9BSD and 2.11BSD
Message-ID:

Greetings,

What's the canonical source for patches to 2.9BSD and 2.11BSD? I see we have
2.11BSD patch 469 dated last month in the archive. Where does it come from?
Has anybody climbed the hill to import all the patches into a git repo? I've
found some mirrors, but moe.2bsd.org has been down for me for ages... How
does Warren keep things up to date?
I also have a (maybe faulty) memory of a similar series of patches to 2.9BSD,
because it was the last BSD to support non-split I&D space machines. Yet a
quick Google search turns up nothing other than a set of patches dated August
1985 (also in our archive) and some changes for variants of hardware (pro,
mscp). Is that it?

Warner

From clemc at ccc.com Fri May 29 04:00:41 2020
From: clemc at ccc.com (Clem Cole)
Date: Thu, 28 May 2020 14:00:41 -0400
Subject: [TUHS] fmt(1): history, POSIX, -t, -c
In-Reply-To:
References: <1jeHk5-5LM-00@marmaro.de>
Message-ID:

On Thu, May 28, 2020 at 12:10 PM Mary Ann Horton wrote:

> It's nice to see a uuencode email attachment once again!
>
I was worried a MIME attachment might not get passed through the automation,
so I stayed with tried and true methods.

> Clem (who, by the way, is correct about the origin of fmt, it was part of
> Kurt Shoens' Mail program at Berkeley) has perfect timing. This coming
> Monday, 6/1/2020, is the 40th anniversary of the uuencode email attachment.
> (The date is based on the date in the uuencode man page in the 2.8BSD and
> 4.2BSD archives at https://www.tuhs.org/Archive/Distributions/UCB/ )
>
I could not have told you the date, but I do remember when you sent it to ber
and myself, specifically.

> There has been some amusing coverage of the "25th anniversary of the email
> attachment", commemorating Nat Borenstein's creation of MIME in 1992.

And his scheme does not work on IBM mainframes or 6-bit machines as is (Nat
required an 8-bit path). Your scheme passed through all known systems at the
time.

> Any thoughts on a proper commemoration of the 40th anniversary?
>
Can't say I know the proper way to do that, other than to say thank you and
acknowledge the hack as a darned creative solution to an issue a lot of us
had.

Clem
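The point about 6-bit and non-8-bit paths can be made concrete with a minimal sketch in C of the classic uuencode mapping. Every three input bytes become four characters drawn from a 64-character range that starts at the space character, so the body contains no lower-case letters and no bytes with the high bit set. The name `uu_encode_line` is made up for this sketch, and sending a zero value as a backquote follows later uuencode versions; as far as I know the 1980 original sent a space instead, so treat that detail as an assumption.

```c
#include <stddef.h>

/* Map one 6-bit value to a printable character: value + 0x20.
   Zero is sent as '`' here (an assumption borrowed from later
   uuencode versions; the earliest one is believed to have used ' '). */
static char uu_char(unsigned v)
{
    v &= 0x3f;
    return v ? (char)(v + 0x20) : '`';
}

/* Encode up to 45 input bytes as one uuencode body line, without the
   trailing newline. out must have room for at least 62 bytes.
   Returns the number of characters written, or 0 if n is too large. */
size_t uu_encode_line(const unsigned char *in, size_t n, char *out)
{
    size_t i, o = 0;

    if (n > 45)
        return 0;
    out[o++] = uu_char((unsigned)n);            /* length character */
    for (i = 0; i < n; i += 3) {
        /* Pad a short final group with zero bytes. */
        unsigned b0 = in[i];
        unsigned b1 = (i + 1 < n) ? in[i + 1] : 0;
        unsigned b2 = (i + 2 < n) ? in[i + 2] : 0;

        out[o++] = uu_char(b0 >> 2);
        out[o++] = uu_char((b0 << 4) | (b1 >> 4));
        out[o++] = uu_char((b1 << 2) | (b2 >> 6));
        out[o++] = uu_char(b2);
    }
    out[o] = '\0';
    return o;
}
```

Calling `uu_encode_line` on the three bytes of "Cat" yields the five characters `#0V%T`: a length character followed by four characters for the 24 input bits. Base64 uses both upper- and lower-case letters, which is one reason it depends on a path that keeps the two cases distinct.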
From rich.salz at gmail.com Fri May 29 04:35:51 2020
From: rich.salz at gmail.com (Richard Salz)
Date: Thu, 28 May 2020 14:35:51 -0400
Subject: [TUHS] fmt(1): history, POSIX, -t, -c
In-Reply-To:
References: <1jeHk5-5LM-00@marmaro.de>
Message-ID:

I thought base-64 worked on EBCDIC/IBM platforms as well. But either way,
uuencode was the first and was very definitely a neat hack that worked and
solved real problems.

From clemc at ccc.com Fri May 29 04:51:04 2020
From: clemc at ccc.com (Clem Cole)
Date: Thu, 28 May 2020 14:51:04 -0400
Subject: [TUHS] fmt(1): history, POSIX, -t, -c
In-Reply-To:
References: <1jeHk5-5LM-00@marmaro.de>
Message-ID:

On Thu, May 28, 2020 at 2:36 PM Richard Salz wrote:

> I thought base-64 worked on ebcdic/ibm platforms as well.
>
I do not believe it works properly with systems like CDC's display code (it
needs octets).

From wkt at tuhs.org Fri May 29 07:49:54 2020
From: wkt at tuhs.org (Warren Toomey)
Date: Fri, 29 May 2020 07:49:54 +1000
Subject: [TUHS] Latest 2.9BSD and 2.11BSD
In-Reply-To:
References:
Message-ID: <20200528214954.GA22861@minnie.tuhs.org>

On Thu, May 28, 2020 at 10:40:55AM -0600, Warner Losh wrote:
> Greetings,
> What's the canonical source for patches to 2.9BSD and 2.11BSD?

Steven Schultz is still the canonical source for 2.11BSD patches. He
sends them to me and I add them to the TUHS archive.

Recently I asked him to roll a new install tape which had all the patches
applied, at https://www.tuhs.org/Archive/Distributions/UCB/2.11BSD_patch457

> I see we have 2.11BSD patch 469 dated last month in the archive. Where
> does it come from? Has anybody climbed the hill to import all the
> patches into a git repo?

I know somebody tried a while back and reported here. They found it wasn't
possible to apply all the patches sequentially.
I'd have to go look in the mail archive for details.

Maybe it's time for someone else to have a go!

Cheers, Warren

From grog at lemis.com Fri May 29 10:18:53 2020
From: grog at lemis.com (Greg 'groggy' Lehey)
Date: Fri, 29 May 2020 10:18:53 +1000
Subject: [TUHS] fmt(1): history, POSIX, -t, -c
In-Reply-To:
References: <1jeHk5-5LM-00@marmaro.de>
Message-ID: <20200529001853.GB27423@eureka.lemis.com>

On Thursday, 28 May 2020 at 9:30:09 -0400, Clem Cole wrote:
> fmt was originally written by Kurt Shoens at UCB when he worked in Mail and
> delivermail.

That agrees with the FreeBSD man page:

HISTORY
     The fmt command appeared in 3BSD.

     The version described herein is a complete rewrite and appeared in
     FreeBSD 4.4.

AUTHORS
     Kurt Shoens
     Liz Allen (added goal length concept)
     Gareth McCaughan

Greg
--
Sent from my desktop computer.
Finger grog at lemis.com for PGP public key.
See complete headers for address and phone numbers.
This message is digitally signed. If your Microsoft mail program
reports problems, please read http://lemis.com/broken-MUA
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 163 bytes
Desc: not available
URL:

From imp at bsdimp.com Fri May 29 10:59:22 2020
From: imp at bsdimp.com (Warner Losh)
Date: Thu, 28 May 2020 18:59:22 -0600
Subject: [TUHS] Latest 2.9BSD and 2.11BSD
In-Reply-To: <20200528214954.GA22861@minnie.tuhs.org>
References: <20200528214954.GA22861@minnie.tuhs.org>
Message-ID:

On Thu, May 28, 2020 at 3:49 PM Warren Toomey wrote:

> On Thu, May 28, 2020 at 10:40:55AM -0600, Warner Losh wrote:
> > Greetings,
> > What's the canonical source for patches to 2.9BSD and 2.11BSD?
>
> Steven Schultz is still the canonical source for 2.11BSD patches. He
> sends them to me and I add them to the TUHS archive.
>
> Recently I asked him to roll a new install tape which had all the patches
> applied, at
> https://www.tuhs.org/Archive/Distributions/UCB/2.11BSD_patch457

Yea.
The oldest one we have is patch 195, which is good news!

> > I see we have 2.11BSD patch 469 dated last month in the archive. Where
> > does it come from? Has anybody climbed the hill to import all the
> > patches into a git repo?
>
> I know somebody tried a while back and reported here. They found it wasn't
> possible to apply all the patches sequentially. I'd have to go look in
> the mail archive for details.
>
> Maybe it's time for someone else to have a go!
>
I think so. There are 40 files that appear on a line starting with 'rm ' or
'Xrm ' (well, maybe a few more if you count a non-functional lint removed; no
way to know for sure due to the '*'). 10 of these files are either binaries
or redundant man pages (meaning the canonical copy is elsewhere, and in a
pinch we could have a very close copy just by omitting them entirely or
copying from the canonical place). The binaries can be regenerated. There are
3 files in pcc that can likely be snagged from 2.10.1. There are 8 files
named 'shortnames.h' that can be had from 2.10.1 as well. There are 2 files
that were created and then later deleted. There's one non-existent file that
was deleted. There are 10 toolchain-related files that we can get from 2.10.1
and/or the CSRG SCCS tree (I haven't checked to see if the PDP-11 versions
are there; they aren't in the easily browsable svn conversion). The entire
source for ar, nm and ld is removed, but I think the 2.10.1 versions are the
same, with the CSRG repo as a fallback. That leaves nsys.c as the only file
not existing in 2.10.1, which makes sense... it implements the new system
call convention in 2.11, and it too may be in the SCCS tree...

So based on that, I think it's worth giving it a try... :)

Comments?

Warner
From doug at cs.dartmouth.edu Fri May 29 11:25:02 2020
From: doug at cs.dartmouth.edu (Doug McIlroy)
Date: Thu, 28 May 2020 21:25:02 -0400
Subject: [TUHS] fmt(1): history, POSIX, -t, -c
Message-ID: <202005290125.04T1P2Mp055697@tahoe.cs.Dartmouth.EDU>

The title of the fmt man page in v9 was "ultra-simple text formatter".
Gnu dropped the "ultra" in favor of AI. Sometimes it does a pretty job.
Sometimes it messes up my typing style. Always it produces an
apparently wavering right margin, as it assigns a separate "optimal"
line length to each paragraph.

It's hard to imagine how this command could stray from classic Unix
simplicity and intelligibility, but Gnu pulled it off.

Doug

From mstiller at me.com Fri May 29 15:10:50 2020
From: mstiller at me.com (Michael Stiller)
Date: Fri, 29 May 2020 07:10:50 +0200
Subject: [TUHS] fmt(1): history, POSIX, -t, -c
In-Reply-To: <20200529001853.GB27423@eureka.lemis.com>
References: <1jeHk5-5LM-00@marmaro.de> <20200529001853.GB27423@eureka.lemis.com>
Message-ID:

It is also included in 2.9BSD, or was it backported:

FMT(1)              UNIX Programmer's Manual              FMT(1)

NAME
     fmt - simple text formatter

SYNOPSIS
     fmt [ name ... ]

DESCRIPTION
     Fmt is a simple text formatter which reads the concatenation
     of input files (or standard input if none are given) and
     produces on standard output a version of its input with
     lines as close to 72 characters long as possible. The
     spacing at the beginning of the input lines is preserved in
     the output, as are blank lines and interword spacing.

     Fmt is meant to format mail messages prior to sending, but
     may also be useful for other simple tasks.

SEE ALSO
     Mail(1), nroff(1), roff(1)

AUTHOR
     Kurt Shoens

BUGS
     The program was designed to be simple and fast - for more
     complex operations, the standard text processors are likely
     to be more appropriate.

> On 29. May 2020, at 02:18, Greg 'groggy' Lehey wrote:
>
> On Thursday, 28 May 2020 at 9:30:09 -0400, Clem Cole wrote:
>> fmt was originally written by Kurt Shoens at UCB when he worked in Mail and
>> delivermail.
>
> That agrees with the FreeBSD man page:
>
> HISTORY
>      The fmt command appeared in 3BSD.
>
>      The version described herein is a complete rewrite and appeared in
>      FreeBSD 4.4.
>
> AUTHORS
>      Kurt Shoens
>      Liz Allen (added goal length concept)
>      Gareth McCaughan
>
> Greg
> --
> Sent from my desktop computer.
> Finger grog at lemis.com for PGP public key.
> See complete headers for address and phone numbers.
> This message is digitally signed. If your Microsoft mail program
> reports problems, please read http://lemis.com/broken-MUA

From grog at lemis.com Fri May 29 15:19:29 2020
From: grog at lemis.com (Greg 'groggy' Lehey)
Date: Fri, 29 May 2020 15:19:29 +1000
Subject: [TUHS] fmt(1): history, POSIX, -t, -c
In-Reply-To:
References: <1jeHk5-5LM-00@marmaro.de> <20200529001853.GB27423@eureka.lemis.com>
Message-ID: <20200529051929.GC27423@eureka.lemis.com>

On Friday, 29 May 2020 at 7:10:50 +0200, Michael Stiller via TUHS wrote:
>> On 29. May 2020, at 02:18, Greg 'groggy' Lehey wrote:
>> On Thursday, 28 May 2020 at 9:30:09 -0400, Clem Cole wrote:
>>> fmt was originally written by Kurt Shoens at UCB when he worked in Mail and
>>> delivermail.
>>
>> That agrees with the FreeBSD man page:
>>
>> HISTORY
>>      The fmt command appeared in 3BSD.
>> ...
>
> It is also included in 2.9BSD, or was it backported:
>
> ...
>
> BUGS
>      The program was designed to be simple and fast - for more
>      complex operations, the standard text processors are likely
>      to be more appropriate.

This paragraph is also in the FreeBSD man page, verbatim. The whole
man page is at
https://www.freebsd.org/cgi/man.cgi?query=fmt&apropos=0&sektion=0&manpath=FreeBSD+12.1-RELEASE+and+Ports&arch=default&format=html

Greg
--
Sent from my desktop computer.
Finger grog at lemis.com for PGP public key.
See complete headers for address and phone numbers.
This message is digitally signed. If your Microsoft mail program
reports problems, please read http://lemis.com/broken-MUA

From clemc at ccc.com Fri May 29 23:39:01 2020
From: clemc at ccc.com (Clem Cole)
Date: Fri, 29 May 2020 09:39:01 -0400
Subject: [TUHS] fmt(1): history, POSIX, -t, -c
In-Reply-To:
References: <1jeHk5-5LM-00@marmaro.de> <20200529001853.GB27423@eureka.lemis.com>
Message-ID:

On Fri, May 29, 2020 at 1:11 AM Michael Stiller via TUHS
<tuhs at minnie.tuhs.org> wrote:

> It is also included in 2.9BSD, or was it backported:
>
Just recompiled. I don't think this was one he had to make any changes to.
As Mary Ann and I said, Kurt wrote it as part of the UCB Mail package [which
includes delivermail(8) - which was the moral parent to sendmail(8)].

The whole key is that Keith did not have a Vax at the Math department (they
had an 11/70 with max memory) and wanted all of the cool programs that were
being created on the Vax. Remember, VM is automatic overlays. So first with
the kernel, and then later with user code, larger and larger programs were
enabled and many of the programs for the Vax migrated to the PDP-11, as
people ran out of address space (IIRC: one of the first user programs that
needed to use overlays was ex/vi. Again, as I recall the original wnj version
by then was such a mess, getting a new/cleaner code base was a large impetus
for Keith to start writing nvi).

Anyway, many smaller programs 'just worked' and the original fmt(1) command
was pretty simple. As Doug so wisely observed: "It's hard to imagine how
this command could stray from classic Unix simplicity and intelligibility,
but Gnu pulled it off."
From imp at bsdimp.com Sat May 30 01:43:12 2020
From: imp at bsdimp.com (Warner Losh)
Date: Fri, 29 May 2020 09:43:12 -0600
Subject: [TUHS] fmt(1): history, POSIX, -t, -c
In-Reply-To:
References: <1jeHk5-5LM-00@marmaro.de> <20200529001853.GB27423@eureka.lemis.com>
Message-ID:

On Fri, May 29, 2020 at 7:40 AM Clem Cole wrote:

> On Fri, May 29, 2020 at 1:11 AM Michael Stiller via TUHS
> <tuhs at minnie.tuhs.org> wrote:
>
>> It is also included in 2.9BSD, or was it backported:
>>
> Just recompiled. I don't think this was one he had to make any changes
> too. As Mary Ann and I said, Kurt wrote as part of the UCB Mail package
> [which includes delivermail(8) - which was the moral parent to
> sendmail(8)].
>
> The whole key is that Keith did not have a Vax at the Math department
> (they had an 11/70 with max memory) and wanted all of the cool programs
> that were being created on the Vax. Remember, VM is automatic overlays.
> So first with the kernel, and then later with user code, larger and larger
> programs were enabled and many of the programs for the Vax migrated to the
> PDP-11, as people ran out of address space (IIRC: one the first user
> programs that needed to use overlays was ex/vi. Again, as I recall the
> original wnj version by then was such a mess, getting a new/cleaner code
> base was a large impetus for Keith to start writing nvi).
>
> Anyway, many smaller programs 'just worked' and the original fmt(1)
> command was pretty simple. As Doug so wisely observed: "It's hard to
> imagine how this command could stray from classic Unix simplicity and
> intelligibility, but Gnu pulled it off."
>
While Berkeley arguably bloated things somewhat in improving its
functionality, gnu said 'here, hold my beer' in the 90s and we're still
holding the beer.

Warner
From clemc at ccc.com Sat May 30 02:12:05 2020
From: clemc at ccc.com (Clem Cole)
Date: Fri, 29 May 2020 12:12:05 -0400
Subject: [TUHS] fmt(1): history, POSIX, -t, -c
In-Reply-To:
References: <1jeHk5-5LM-00@marmaro.de> <20200529001853.GB27423@eureka.lemis.com>
Message-ID:

The beer is well beyond the 'suck point' and it's time to throw it out.

On Fri, May 29, 2020 at 11:43 AM Warner Losh wrote:

> On Fri, May 29, 2020 at 7:40 AM Clem Cole wrote:
>
>> On Fri, May 29, 2020 at 1:11 AM Michael Stiller via TUHS
>> <tuhs at minnie.tuhs.org> wrote:
>>
>>> It is also included in 2.9BSD, or was it backported:
>>>
>> Just recompiled. I don't think this was one he had to make any changes
>> too. As Mary Ann and I said, Kurt wrote as part of the UCB Mail package
>> [which includes delivermail(8) - which was the moral parent to
>> sendmail(8)].
>>
>> The whole key is that Keith did not have a Vax at the Math department
>> (they had an 11/70 with max memory) and wanted all of the cool programs
>> that were being created on the Vax. Remember, VM is automatic overlays.
>> So first with the kernel, and then later with user code, larger and larger
>> programs were enabled and many of the programs for the Vax migrated to the
>> PDP-11, as people ran out of address space (IIRC: one the first user
>> programs that needed to use overlays was ex/vi. Again, as I recall the
>> original wnj version by then was such a mess, getting a new/cleaner code
>> base was a large impetus for Keith to start writing nvi).
>>
>> Anyway, many smaller programs 'just worked' and the original fmt(1)
>> command was pretty simple. As Doug so wisely observed: "It's hard to
>> imagine how this command could stray from classic Unix simplicity and
>> intelligibility, but Gnu pulled it off."
>>
>
> While Berkeley arguably bloated things somewhat in improving its
> functionality, gnu said 'here, hold my beer' in the 90s and we're still
> holding the beer.
>
> Warner
>

From mah at mhorton.net Sat May 30 03:14:10 2020
From: mah at mhorton.net (Mary Ann Horton)
Date: Fri, 29 May 2020 10:14:10 -0700
Subject: [TUHS] fmt(1): history, POSIX, -t, -c
In-Reply-To:
References: <1jeHk5-5LM-00@marmaro.de> <20200529001853.GB27423@eureka.lemis.com>
Message-ID: <0222efc8-e02f-7de8-dc94-9cba389c98a4@mhorton.net>

On 5/29/20 6:39 AM, Clem Cole wrote:
>
> On Fri, May 29, 2020 at 1:11 AM Michael Stiller via TUHS wrote:
>
> It is also included in 2.9BSD, or was it backported:
>
> Just recompiled. I don't think this was one he had to make any changes
> too. As Mary Ann and I said, Kurt wrote as part of the UCB Mail
> package [which includes delivermail(8) - which was the moral parent to
> sendmail(8)].

fmt was literally part of Mail - it compiled in the same source directory,
and was considered user agent (UA) code. delivermail/sendmail was separate,
also from Berkeley, but written primarily by Eric Allman, and was the mail
transport agent (MTA).

It's in 2.8BSD as well.

> The whole key is that Keith did not have a Vax at the Math department
> (they had an 11/70 with max memory) and wanted all of the cool
> programs that were being created on the Vax. Remember, VM is
> automatic overlays. So first with the kernel, and then later with
> user code, larger and larger programs were enabled and many of the
> programs for the Vax migrated to the PDP-11, as people ran out of
> address space (IIRC: one the first user programs that needed to use
> overlays was ex/vi. Again, as I recall the original wnj version by
> then was such a mess, getting a new/cleaner code base was a large
> impetus for Keith to start writing nvi).

ex/vi didn't use overlays (unless you count split I/D). It fit in 64K by
using ifdefs. Less useful code, like supporting upper-case-only terminals,
would be ifdeffed out on the pdp11.
    Mary Ann

From paul at rileyriot.com Sun May 31 16:26:25 2020
From: paul at rileyriot.com (Paul Riley)
Date: Sun, 31 May 2020 14:26:25 +0800
Subject: [TUHS] LSX on the PDP-11/03 (LSI-11)
Message-ID:

I've managed to acquire a PDP-11/03 with twin floppy drives (Sykes
Datatronics RX01 or RX02 equivalents, not sure yet which). I've stumbled
across LSX, and I have it running on SimH. I'm quite inexperienced with
Unix, but it's something I want to learn well, having brushed against it at
university in the '80s, and having played with Linux somewhat. I have some
interest in Forth, but I don't like the block system of early Forths such as
FigForth, and I plan to create a new Forth based on FigForth, but supporting
external source files.

Anyway, I've tried compiling Hello World on LSX, and I get a "1: External
definition syntax" error. Some help would be nice, but more generally, is
anyone on this list more than vaguely familiar with LSX, or 6th Edition
itself?

    void main () {
        printf("Hello World!");
    }

It seems that the 7th Edition was the beginning of the standard library in
C, and that this is missing in LSX. I'm not sure if printf is an intrinsic
function in (6th Edition) C, or if it's from a library.

My questions are a bit random, but I'm looking to converse with others with
LSX experience.

Paul

*Paul Riley*

Email: paul at rileyriot.com

From pnr at planet.nl Sun May 31 21:09:31 2020
From: pnr at planet.nl (Paul Ruizendaal)
Date: Sun, 31 May 2020 13:09:31 +0200
Subject: [TUHS] non-blocking IO
Message-ID:

This time looking into non-blocking file access. I realise that the term has
wider application, but right now my scope is “communication files” (tty’s,
pipes, network connections).

As far as I can tell, prior to 1979 non-blocking access did not appear in
the Spider lineage, nor did it appear in the NCP Unix lineage.
First appearance of non-blocking behaviour seems to have been with Chesson’s
multiplexed files where it is marked experimental (an experiment within an
experiment, so to say) in 1979.

The first appearance resembling the modern form appears to have been with
SysIII in 1980, where open() gains an O_NDELAY flag and appears to have had
two uses: (i) when used on TTY devices it makes open() return without waiting
for a carrier signal (and subsequent read() / write() calls on the descriptor
return with 0, until the carrier/data is there); and (ii) on pipes and
fifo’s, read() and write() will not block on an empty/full pipe, but return 0
instead. This behaviour seems to have continued into SysVR1; I’m not sure
when EAGAIN came into use as a return value for this use case in the SysV
lineage. Maybe with SysVR3 networking?

In the Research lineage, the above SysIII approach does not seem to exist,
although the V8 manual page for open() says under BUGS “It should be possible
[...] to optionally call open without the possibility of hanging waiting for
carrier on communication lines.” In the same location for V10 it reads “It
should be possible to call open without waiting for carrier on communication
lines.”

The July 1981 design proposals for 4.2BSD note that SysIII non-blocking
files are a useful feature and should be included in the new system. In
Jan/Feb 1982 this appears to be coded up, although not all affected files are
under SCCS tracking at that point in time. Non-blocking behaviour is changed
from the SysIII semantics, in that EWOULDBLOCK is returned instead of 0 when
progress is not possible. The non-blocking behaviour is extended beyond TTY’s
and pipes to sockets, with additional errors (such as EINPROGRESS). At this
time EWOULDBLOCK is not the same error number as EAGAIN.

It would seem that the differences between the BSD and SysV lineages in this
area persisted until around 2000 or so.

Is that a fair summary?
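The difference between the two conventions summarised above can be sketched in C as it looks on a present-day POSIX system. This is an illustration only, not historical code: `empty_pipe_read_errno` is a made-up name, and `O_NONBLOCK` is the POSIX spelling of the flag rather than the SysIII `O_NDELAY`. Under the SysIII semantics described above the same read() would have returned 0, which a caller cannot tell apart from end-of-file; the BSD-style convention reports -1 with an error code instead.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Read from an empty pipe whose read end has been made non-blocking.
   Returns the errno value that read() reported, 0 if read() did not
   fail, or -1 if the pipe could not be set up. */
int empty_pipe_read_errno(void)
{
    int fds[2];
    int flags, err = 0;
    char c;
    ssize_t n;

    if (pipe(fds) == -1)
        return -1;

    /* Add O_NONBLOCK to the flags of the read end. */
    flags = fcntl(fds[0], F_GETFL);
    if (flags == -1 || fcntl(fds[0], F_SETFL, flags | O_NONBLOCK) == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    /* Nothing has been written and the write end is still open, so a
       blocking read would hang here; the non-blocking one returns at once. */
    n = read(fds[0], &c, 1);
    if (n == -1)
        err = errno;            /* save before close() can disturb it */

    close(fds[0]);
    close(fds[1]);
    return err;
}
```

On current systems the value returned is EAGAIN, and many systems define EWOULDBLOCK as the same number, so portable callers usually test for both.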
- - -

I’m not quite sure why the Research lineage did not include non-blocking behaviour, especially in view of the man page comments. Maybe it was seen as against the Unix philosophy, with select() offering sufficient mechanism to avoid blocking (with open() the hard corner case)?

In the SysIII code base, the FNDELAY flag is stored on the file pointer (i.e. with struct file). This has the effect that the flag is shared between processes using the same pointer, but can be changed in one process (using fcntl) without the knowledge of the others. It seems more logical to me to have made it a per-process flag (i.e. with struct user) instead. In this respect the SysIII semantics carry through to today’s Unix/Linux. Was this semantic a deliberate design choice, or simply an overlooked complication?

From ron at ronnatalie.com Sun May 31 22:34:04 2020
From: ron at ronnatalie.com (ron at ronnatalie.com)
Date: Sun, 31 May 2020 08:34:04 -0400
Subject: [TUHS] LSX on the PDP-11/03 (LSI-11)
In-Reply-To:
References:
Message-ID: <193c770d408ee131dbaa9a07dba6f068.squirrel@squirrelmail.tuffmail.net>

> Anyway, I've tried compiling Hello World on LSX, and I get "1: External
> definition syntax" error. Some help would be nice, but more generally, is
> anyone on this list more than vaguely familiar with LSX, or 6th Edition
> itself?
>
> void main () {
> printf("Hello World!");
> }
>
> It seems that the 7th Edition was the beginning of the standard library in
> C, and that this is missing in LSX. I'm not sure if printf is an intrinsic
> function in (6th Edition) C, or if it's from a library.
>

First off, void main is not legal in any standard version of C. Even when the language allows implementation-defined extensions to the main signature, it must still return int.

If you have a later version of the language supported, you have to declare printf rather than allowing it to default to an int-returning function.
Add #include <stdio.h>

From meillo at marmaro.de Sun May 31 22:35:25 2020
From: meillo at marmaro.de (markus schnalke)
Date: Sun, 31 May 2020 14:35:25 +0200
Subject: [TUHS] fmt(1): history, POSIX, -t, -c
In-Reply-To: <1jeHk5-5LM-00@marmaro.de>
References: <1jeHk5-5LM-00@marmaro.de>
Message-ID: <1jfNBV-75W-00@marmaro.de>

Hoi,

thanks a lot to everyone who contributed information and opinions. They were helpful. I've got a much better understanding of the situation now.

It's so good to have a place like this mailing list! :-)

meillo

From meillo at marmaro.de Sun May 31 23:01:48 2020
From: meillo at marmaro.de (markus schnalke)
Date: Sun, 31 May 2020 15:01:48 +0200
Subject: [TUHS] mh/hm, mmh (was: fmt(1): history, POSIX, -t, -c)
In-Reply-To:
References: <1jeHk5-5LM-00@marmaro.de>
Message-ID: <1jfNb2-7JV-00@marmaro.de>

Hoi.

[2020-05-28 09:46] Clem Cole
>
> [...] until
> I finally switched from mh (actually the hm version) to the Gmail interface as
> my MUI client.

Would you be so kind as to explain a bit about the hm version of MH?

Ten years ago I wanted to improve nmh, because I found it bad that it took me months to configure it in so many ways to get it usable for modern emailing. Even at that point I hadn't found some of its cool features, which were all deactivated by default. I argued but couldn't convince the nmh community. Later I used my master's thesis as the opportunity to create an experimental version of nmh, to convince by demonstration. Have a look at my master's thesis, if you like:

http://marmaro.de/docs/master/

It actually became a fork, now named mmh. The project is still active. Philipp Takacs in particular has done a lot, among other things replacing m_getfld(), a highly optimized mail reading function. See the pre-mmh version of it for an entertaining read:

http://git.marmaro.de/?p=mmh;a=blob;f=docs/m_getfld.c.humor;h=46449095d

This is our replacement:

http://git.marmaro.de/?p=mmh;a=blob;f=sbr/m_getfld2.c;h=b9a618d16

I'm much interested in any MH background.
Shockingly I cannot recall having read about hm before ...

meillo

From pnr at planet.nl Sun May 31 23:50:11 2020
From: pnr at planet.nl (Paul Ruizendaal)
Date: Sun, 31 May 2020 15:50:11 +0200
Subject: [TUHS] LSX on the PDP-11/03 (LSI-11)
Message-ID: <083374AA-1CF0-4341-93C9-795A7E8E3BBA@planet.nl>

> I've stumbled across LSX, and I have it running on SimH. I'm quite inexperienced with Unix, but it's something I want to learn well, having brushed against it at university in the '80s, and having played with Linux somewhat.

I think you will experience a sizeable learning curve. You will be working with Unix and C as they stood around 1975, and that is substantially different from what they were in the 80’s. That said, I know from personal experience that it is an intriguing journey and certainly not impossible to do.

> Some help would be nice, but more generally, is anyone on this list more than vaguely familiar with LSX, or 6th Edition itself?

Many on this list are familiar with 6th edition. The best way to learn the internals of 6th edition is the “Lions’ book”:
https://www.amazon.com/Lions-Commentary-Unix-John/dp/1573980137

Coming from today’s perspective (or a 1980’s one), you will find the following key challenges:

- The version of C used for 6th edition is different from that of the 1980’s. Amongst other things the syntax of the assignment operators changed (not += but =+), the syntax for initialisation changed (not ‘int a = 3’, but ‘int a 3’), the ‘long’ datatype (32 bit on a PDP-11) did not exist, the ‘void’ keyword did not exist, structs could not be assigned, passed or returned (only pointers to structs), etc.

- The stdio library did not exist yet; in its place there was the ‘portable i/o library’. This may be the hardest part to get used to.

- The file system was 16-bit based throughout. This has implications for stat(); lseek() did not exist (its precursor seek() used additional whence values to move the file pointer in 512 byte increments), etc.
When it comes to LSX, there are a few people on this list who have experience with it (that I know of; there may be many more).

First of all, Heinz Lycklama, the original creator of LSX, appears to read this list from time to time.

Second, Leonid Broukhis and Serge Vakulenko (who managed to recover the LSX sources 20 years ago) might be reading the list. They took the trouble to port LSX to the Soviet BK-0010 computer, an LSI-11 type system, some 15 years ago:
https://github.com/sergev/bkunix
You can stand on their shoulders, as they already took the trouble to convert the kernel source from 1975 C to 1980’s C and to create a stdio compatible library for it. They are using the 2.11BSD C compiler, which generates tighter code than the 1975 compiler, hence they could squeeze in a bit more functionality.

Third, I found the BK-0010 port some 5 years ago and used that as a base to create a version that would run on a small TI990 clone:
http://www.stuartconner.me.uk/mini_cortex/mini_cortex.htm
This work later evolved into a stock 6th edition kernel and is now a curious mix of stuff dating from 1975 to 1985.

Your main challenge will be that neither the BK-0010 work nor my work will run on your hardware as-is. I think you have two possible paths forward. The first is to learn C and the library as they stood in 1975; the second is to take the BK-0010 code and make it run on a stock LSI-11 again.