Thu, 10 Sep 1998 17:33:43 +0000  Stephen Adler
Subject: find out which TD is giving problems.

This is a short tutorial on finding out which TD is not working.

Last night's problem was due to a lookup memory problem in one of the
TD control boards. The ssp, in its final stage of ssp-initing, loads
the TD control boards with all the data necessary to operate. This
includes loading the lookup memories used by Level 1.1. The lookup
memory failed to load properly for this one TD control board. The
output from sspinit as seen on PPC window #3 (or cable segment #3)
was this:

ddLogTask-e787ppc03: Dump of block MORE_INFO from SSP PA 0x101
ddLogTask-e787ppc03: 42434454 1111704660 TDCB
ddLogTask-e787ppc03: 49494e49 1229540937 INII
ddLogTask-e787ppc03: 52435746 1380144966 FWCR
ddLogTask-e787ppc03: 43435553 1128486227 SUCC
ddLogTask-e787ppc03: 55534454 1431520340 TDSU
ddLogTask-e787ppc03: 43304345 1127236421 EC0C
ddLogTask-e787ppc03: 00000007          7 ....
ddLogTask-e787ppc03: 00000003          3 ....
ddLogTask-e787ppc03: 0000032f        815 /...
ddLogTask-e787ppc03: 00000347        839 G...
ddLogTask-e787ppc03: 4c494146 1279869254 FAIL
ddLogTask-e787ppc03: 4c494146 1279869254 FAIL
ddLogTask-e787ppc03: 4c494146 1279869254 FAIL
ddLogTask-e787ppc03: Closing-RPC-link

The slot number which failed was #7. This corresponds to the second TD
counting from the right in TD crate 601 (the middle right TD crate).

In general, when you get an SSP dump like the one above, after an SSP
init fails or when data taking stops due to an SSP error or cable
segment error, the way to figure out which board or SSP failed is to
look at the numbers in the dump. The dump has 4-character words in the
rightmost column: some are legible, like FAIL or TDSU or FWCR, and
some are illegible, like G... or .... or ..\. etc. What I do is look
down this column until I see the first set of illegible characters.
This indicates a slot number or cable segment address number.
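
The same scan can be sketched in a few lines of Python. This is only an
illustration, not part of the SSP software: the function names are made
up, the rule "legible means all four bytes are ASCII letters or digits"
is an assumption, and a large number could in principle look like text.
The rightmost column of the dump is just the four bytes of each word,
low byte first, with unprintable bytes shown as dots.

```python
import re

# Lines copied from the MORE_INFO dump above.
DUMP = """\
ddLogTask-e787ppc03: Dump of block MORE_INFO from SSP PA 0x101
ddLogTask-e787ppc03: 42434454 1111704660 TDCB
ddLogTask-e787ppc03: 49494e49 1229540937 INII
ddLogTask-e787ppc03: 52435746 1380144966 FWCR
ddLogTask-e787ppc03: 43435553 1128486227 SUCC
ddLogTask-e787ppc03: 55534454 1431520340 TDSU
ddLogTask-e787ppc03: 43304345 1127236421 EC0C
ddLogTask-e787ppc03: 00000007 7 ....
ddLogTask-e787ppc03: 00000003 3 ....
ddLogTask-e787ppc03: 0000032f 815 /...
ddLogTask-e787ppc03: 00000347 839 G...
ddLogTask-e787ppc03: 4c494146 1279869254 FAIL
ddLogTask-e787ppc03: 4c494146 1279869254 FAIL
ddLogTask-e787ppc03: 4c494146 1279869254 FAIL
ddLogTask-e787ppc03: Closing-RPC-link
"""

# A data line looks like "<task>: <8 hex digits> <decimal> <4 characters>".
WORD_LINE = re.compile(r"^\S+:\s+([0-9a-fA-F]{8})\s+\d+")


def parse_words(dump_text):
    """Return the 32-bit words of the dump as integers, in order."""
    words = []
    for line in dump_text.splitlines():
        match = WORD_LINE.match(line)
        if match:
            words.append(int(match.group(1), 16))
    return words


def as_text(word):
    """Render a word like the rightmost column: low byte first, dots for unprintables."""
    raw = word.to_bytes(4, "little")
    return "".join(chr(b) if 0x20 <= b < 0x7F else "." for b in raw)


def is_legible(word):
    """Assumed rule: a tag is a word whose four bytes are all ASCII letters or digits."""
    return word.to_bytes(4, "little").isalnum()


def first_number(words):
    """Return (last tag seen, value) for the first word that is not a legible tag."""
    last_tag = None
    for word in words:
        if is_legible(word):
            last_tag = as_text(word)
        else:
            return last_tag, word
    return last_tag, None


tag, value = first_number(parse_words(DUMP))
print(tag, value)
```

For the dump above, the first illegible word is the 7 that follows the
EC0C tag.
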
In this case, EC0C meant an error in checking the lookup memories (I
think), and the following illegible string '....', which is really the
number 7 (look at the second column from the right), is the slot
number which generated the error.

There will be other problems, and you have to look at the error code
(the 4-letter text string) and the numbers which follow to figure out
what the exact nature of the error is and which module is generating
it. Sometimes I have to resort to searching through the ssp source
code to figure out what the error means. (Renee, is there someplace
you and John have documented all the SSP errors besides in the code?)

I hope this small tutorial helps....

Steve.

P.S. Another TIP. All output which you see going to the PPC window is
also written to a log file. These log files are:

/online/ppclog/e787ppc01.log ..... for the output of PPC01 or cable segment #1
/online/ppclog/e787ppc02.log ..... for the output of PPC02 or cable segment #2
/online/ppclog/e787ppc03.log ..... for the output of PPC03 or cable segment #3

------------------------------------------------------------------------------
Thu, 10 Sep 1998 18:25:31 +0000  Stephen Adler

Nanoseconds after my last message was received by interested parties,
I was asked to write about the fastbus crate setup (i.e. how do I know
which crate is involved...). Instead of writing a diatribe on the
architecture of the PPC/SFI/cable segment split, I figure it's best
not to try to explain how to figure out which crate or SSP is giving
the error but to just write up a cheat sheet. I hope the cheat sheet
needs no explaining....

Fastbus | sspcheck | PPC01  | PPC02  | PPC03
Crate   | address  | Xterm  | Xterm  | Xterm
        | from ku7 | Window | Window | Window
----------------------------------------------
Trigger |   801    |  101   |        |
Fera    |   802    |  102   |        |
ADC1    |   803    |  103   |        |
ADC2    |   804    |  104   |        |
TDC1    |   807    |  107   |        |
TDC2    |   808    |  108   |        |
TDC3    |   809    |  109   |        |
CCDA    |   80A    |  10A   |        |
CCDB    |   80B    |  10B   |        |
CCDC    |   80C    |  10C   |        |
CCDD    |   80D    |  10D   |        |
TD00    |   A01    |        |  101   |
TD01    |   601    |        |        |  101
TD02    |   612    |        |        |  112
TD03    |   A13    |        |  113   |
TD04    |   614    |        |        |  114
TD05    |   615    |        |        |  115
TD06    |   A16    |        |  116   |
TD07    |   A17    |        |  117   |
TD08    |   A18    |        |  118   |

I guess this table does need explaining....

Column 1 is the Fastbus crate in question. (Or the SSP in the fastbus
crate, to be exact!)
Column #2 is the address used by sspcheck on bnlku7 (i.e. accessing
the ssp's using Bob Hackenberg's ssp software over the branch
bus/BBFC).
Column #3 is the address used by the PPC and is the one listed at the
beginning of the MORE_INFO SSP dump. Likewise for columns #4 and #5,
for PPC2 and PPC3 respectively.

Now you are going to want to know physically which fastbus crate is
which in the upstairs counting house. (Someone else besides John H.,
Renee and myself should know...)

              TD Racks                     Master Trigger Rack
-------------------------------------      ------------------
|   TD06    |   TD03    |   TD00    |      | Master Trigger |
-------------------------------------      ------------------
|   TD07    |   TD04    |   TD01    |      |      Fera      |
-------------------------------------      ------------------
|   TD08    |   TD05    |   TD02    |      |                |
-------------------------------------      ------------------

   ADC      TDC
   Rack     Rack
-------------------
|  ADC1  |  TDC2  |
-------------------
|  ADC2  |  TDC1  |
-------------------
|        |  TDC3  |
-------------------

                  CCD Racks
next to the stairs       next to the work bench
-------------------      -------------------
|        |        |      |        |        |
-------------------      -------------------
|  CCDD  |  CCDA  |      |  CCDB  |  CCDC  |
-------------------      -------------------
|        |        |      |        |        |
-------------------      -------------------
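
For anyone who would rather look things up than read the table, the
cheat sheet can be written down as a small Python lookup. This is only
a sketch: the names CHEAT_SHEET and crate_for are made up here, and the
addresses are treated as hexadecimal on the assumption that they match
the "SSP PA 0x101" form printed at the top of the MORE_INFO dump.

```python
# Each row: (Fastbus crate, sspcheck address from ku7, PPC number,
#            address shown in that PPC's xterm window).
# Values are copied from the cheat sheet; hex is an assumption.
CHEAT_SHEET = [
    ("Trigger", 0x801, 1, 0x101),
    ("Fera",    0x802, 1, 0x102),
    ("ADC1",    0x803, 1, 0x103),
    ("ADC2",    0x804, 1, 0x104),
    ("TDC1",    0x807, 1, 0x107),
    ("TDC2",    0x808, 1, 0x108),
    ("TDC3",    0x809, 1, 0x109),
    ("CCDA",    0x80A, 1, 0x10A),
    ("CCDB",    0x80B, 1, 0x10B),
    ("CCDC",    0x80C, 1, 0x10C),
    ("CCDD",    0x80D, 1, 0x10D),
    ("TD00",    0xA01, 2, 0x101),
    ("TD01",    0x601, 3, 0x101),
    ("TD02",    0x612, 3, 0x112),
    ("TD03",    0xA13, 2, 0x113),
    ("TD04",    0x614, 3, 0x114),
    ("TD05",    0x615, 3, 0x115),
    ("TD06",    0xA16, 2, 0x116),
    ("TD07",    0xA17, 2, 0x117),
    ("TD08",    0xA18, 2, 0x118),
]

# Index by (PPC number, PPC address), since the same PPC address
# (e.g. 0x101) appears in more than one PPC window.
BY_PPC = {(ppc, addr): (crate, ssp) for crate, ssp, ppc, addr in CHEAT_SHEET}


def crate_for(ppc, address):
    """Return (crate name, sspcheck address) or None if not in the table."""
    return BY_PPC.get((ppc, address))


# The dump in the first message came from PPC03 with SSP PA 0x101.
print(crate_for(3, 0x101))
```

The PPC number has to be part of the key: address 101 means the Trigger
crate on PPC01, TD00 on PPC02, and TD01 on PPC03.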