program instshift(instin, instshiftp, instout, output); (* instshift: shift coordinates of delila instructions by Rye Shultzaberger shultzab@ncifcrf.gov http://www.lecb.ncifcrf.gov/~shultzab/ modified by Dr. Thomas D. Schneider National Institutes of Health National Cancer Institute Center for Cancer Research Nanobiology Program Molecular Information Theory Group Frederick, Maryland 21702-1201 toms@ncifcrf.gov permanent email: toms@alum.mit.edu (use only if first address fails) http://www.ccrnp.ncifcrf.gov/~toms/ *) label 1; (* end of program *) const (* begin module version *) version = 1.63; (* of instshift.p 2009 Apr 27 2005 Sep 24, 1.51: proper version upgrade 2005 Sep 13, 1.50: spelling: asterisk 2004 Sep 9, 1.49: upgrade for GPC 2004 Jul 8, 1.48: handle name quotes on get line! allow 'piece' inside name. 2002 May 4, 1.47: pointer to glossary 2001 Oct 11, 1.43: Ignore quote marks in a piece instruction 2001 Oct 5, 1.42: single quotes exclude double quotes in proc quotecomment 2001 Aug 8, 1.40: create instruction complement 2001 Aug 3, 1.37: cleanup 2001 Aug 3, 1.36: inst -> instin so I don't go crazy. 2001 May 21, 1.33: fix documentation on what happens if direction not specified. 2001 May 21, 1.32: technical report: what happens if direction not specified. 2001 May 9, 1.31: bug: does not handle direction in r mode. 2000 Dec 6, 1.30: documentation upgrade (TDS) 2000 Dec 6, 1.29: range control, automatic upgrade of parameters (TDS) 2000 Dec 6, 1.28: reset trigger properly (TDS) 2000 Dec 6, 1.27: fix bug: 'get' inside strings - account for strings (TDS) 2000 Dec 5, 1.21: fix bug: comparen at eoln kills rest of inst! (TDS) 2000 Dec 5, 1.19: apply 'same' only if appropriate to instout (TDS) 2000 Dec 5, 1.18: rename variables to account for COMPLETE delila instruction (TDS) 2000 Dec 5, 1.17: bug fix, now reads double cr correctly. (TDS) 2000 Dec 4, 1.13: bug fix when from range was zero in writeshift. (TDS) 2000 Nov 28, 1.12: fix documentation (TDS) 2000 May 10, 1.08: read instructions free format (TDS) 2000 May 9, 1.07: output file renamed instout rather than shiftinst (TDS) 2000 Feb 23, 1.03: fix bug in reading inst comments origin 1999 July 2 *) updateversion = 1.58; (* defines lowest acceptable current parameter file *) (* end module version *) (* begin module describe.instshift *) (* name instshift: shift coordinates of delila instructions synopsis instshift(instin: in, instshiftp: inout, instout: out, output: out) files instin: input: instruction file for delila instout: output: the shifted inst file instshiftp: parameters to control the program. The file must contain the following parameters, one per line: parameterversion: The version number of the program. This allows the user to be warned if an old parameter file is used. The program is smart enough to recognize that there is an old parameter file, and will upgrade the file to the latest version. numbtoshift (integer): The number of bases to shift, if you want the -9 position to be your new zero coordinate, then this number would be -9. If you want the +9 position to be be your new zero coordinate, then this number would be 9. rangechange (char) fromrange (integer) torange (integer) Range change control. If rangechange is: n: no change r: change both fromrange and torange to the given values f: change just fromrange to the given value t: change just torange to the given value s: change both fromrange and torange to match the shift Capital letters (NRFTS) mean to create the complementary Delila instruction after doing the rangechange (except N where only the complement is generated). The s mode allows the range to change so that only the zero base is altered but that sequence obtained is the same as before. This only matters if one is trying to avoid sequences on the edge of a binding site, for example a second copy of the same site that one does not want to show up in the primary sequence logo. output: messages to the user description Shift all the coordinates in an inst file, keeping the same logo but changing the zero coordinate. examples With a parameter numtoshift of 1000 and no range change, the instructions: get from 0 +100 to same -100; get from 4 -100 to 5 +100 direction +; get from 6 -100 to same +100 direction +; get from 7 +100 to 8 -100 direction -; get from 9 +100 to same -100 direction -; become: get from -1000 +100 to same -100; get from 1004 -100 to 1005 +100 direction +; get from 1006 -100 to same +100 direction +; get from -993 +100 to -992 -100 direction -; get from -991 +100 to same -100 direction -; example instshiftp: 1.29 version of instshift that this parameter file is designed for. +10 number of bases to shift the coordinate system r -10 +10 rangechange: nrft fromrange torange documentation see also {parameter file example:} instshiftp {the program that uses the instructions:} delila.p {program to remove comments:} nocom.p {The instshift program can be used to adjust delila instructions for binding sites with any symmetry. A treatise on binding site symmetries is available:} http://www.lecb.ncifcrf.gov/~toms/glossary.html#binding_site_symmetry author Rye Kent Shultzaberger additional features by Tom Schneider bugs * The program will fail on circular pieces. 2001 May 1: This may not be true anymore. * The delilacomment routine will trigger both comparen and comcurly error messages if a comcurly is not closed and there are more comparens after that. * A possible future option would be to complement the instructions so that one gets the complementary sequences. Is this ever useful? * It is standard practice to name some promoters with single quote marks. These were identified as quote strings, killing the rest of the inst. To solve this, once a piece instruction is identified, the rest of the Delila instruction, up to the semicolon, is not checked for quotes or comments. This COULD cause a problem IF someone has a semicolon inside a comment BEFORE the true semicolon end of the Delila instruction. piece bubba' (@ this would cause a problem ; get @) ; (where '@' is substitued for '*') technical notes * Starting with this inst file: get from 0 +100 to same -100; and using this instshift parameter file: 1.29 version of instshift that this parameter file is designed for. -10 number of bases to shift the coordinate system n -40 +5 rangechange: nrft fromrange torange instshift gives an instout file like this: get from -10 +100 to same -100 direction +; Since the user did not specify the direction, the program does not know the direction to put at the end of the instruction. In this case the user may think that the direction is negative, but on a circular piece this is a valid instruction. If you do not give the direction in the inst file, the program defaults to not giving a direction. MUTATION RATE COMPENSATION Since the mutations are generated by Delila instructions, they have to be of a form that does not require knowing the previous base. This is done by using the insertion form i79,81t. However, given a particular base b, this base will be generated 1/4 of the time. So the actual mutation rate would be lower than requested. To compensate for this the program multiplies the requested frequency by 4/3. Note that high mutation rates result in very long mutation string names that Delila cannot handle without increasing the parameter namelength. Note that if namelength is increased, it must be done in catal also and the delila catalogues must be rebuilt. *) (* end module describe.instshift *) (* module filler.const *) fillermax = 50; (* the size of the filler array for a string *) (* module filler.const from prgmod.p 4.20 *) (* begin module string.const *) maxstring = 2000; (* the maximum string *) (* end module string.const version = 4.86; (@ of prgmod.p 2004 Sep 8 *) (* begin module datetime.const *) datetimearraylength = 19; (* length of dataarray for dates, It is just long enough to include the 4 digit year - solving the year 2000 problem: 1980/06/09 18:49:11 123456789 123456789 1 2 *) (* end module datetime.const version = 1.17; (@ of timegpc.p 2002 Oct 9 *) type (* begin module string.type *) stringptr = ^string; (* pointer to a string *) string = record (* a string of characters *) letters: array[1..maxstring] of char; (* the letters in the string *) length: integer; (* the number of characters in the string *) current: integer; (* the letter we are working on *) next: stringptr; (* the next string in a series *) end; (* end module string.type version = 4.86; (@ of prgmod.p 2004 Sep 8 *) (* begin module filler.type *) (* the following is an array used to fill a string. it is convenient to have it much shorter than the maxstring, so that it is easy to fill the string using procedure fillstring. the user must declare the value of constant fillermax. *) filler = packed array[1..fillermax] of char; (* end module filler.type version = 4.86; (@ of prgmod.p 2004 Sep 8 *) (* begin module trigger.type *) trigger = record (* an object to be searched for *) seek: string; (* the characters looked for *) state: integer; (* how close to triggering we are *) skip: boolean; (* trigger not found- skip the line *) found: boolean (* the trigger was found *) end; (* end module trigger.type version = 4.86; (@ of prgmod.p 2004 Sep 8 *) (* begin module parameters *) parameters = record parameterversion: real; (* parameter version number *) numbtoshift: integer; (* how many bases to shift *) rangechange: char; (* range change control: n: no change r: change both fromrange and torange f: change just fromrange t: change just torange *) fromrange: integer; (* new from range *) torange: integer; (* new to range *) end; (* end module parameters *) (* begin module datetime.type *) (* array for dates *) datetimearray = packed array[1..datetimearraylength] of char; (* end module datetime.type version = 1.17; (@ of timegpc.p 2002 Oct 9 *) var instin, (* file used by this program *) instshiftp, (* file used by this program *) instout: text; (* file used by this program *) (* begin module skipblanks *) (* 2003 July 31: tab is considered a blank character *) function isblank(c: char): boolean; (* is the character c blank or tab? *) begin isblank := (c = ' ') or (c = chr(9)) end; procedure skipblanks(var thefile: text); (* skip over blanks until a non-blank, or end of line, is found *) begin while isblank(thefile^) and not eoln(thefile) do get(thefile); end; procedure skipnonblanks(var thefile: text); (* skip over nonblanks until a blank, or end of line, is found *) begin while (not isblank(thefile^)) and not eoln(thefile) do get(thefile); end; procedure skipcolumn(var thefile: text); (* skip over a data column *) begin skipblanks(thefile); skipnonblanks(thefile) end; (* end module skipblanks version = 7.67; {of delmod.p 2004 Sep 8} *) (* begin module halt *) procedure halt; (* stop the program. the procedure performs a goto to the end of the program. you must have a label: label 1; declared, and also the end of the program must have this label: 1: end. examples are in the module libraries. this is the only goto in the delila system. *) begin writeln(output,' program halt.'); goto 1 end; (* end module halt version = 1.17; (@ of timegpc.p 2002 Oct 9 *) (* begin module copyaline *) (* for transfering header info *) procedure copyaline(var fin, fout: text); (* copy a line from file fin to file fout *) begin (* copyaline *) while not eoln(fin) do begin fout^ := fin^; put(fout); get(fin) end; readln(fin); writeln(fout); end; (* copyaline *) (* end module copyaline version = 7.67; {of delmod.p 2004 Sep 8} *) (* begin module clearstring *) procedure clearstring(var ribbon: string); (* empty the string *) var index: integer; (* to the ribbon *) begin (* clearstring *) with ribbon do begin for index := 1 to maxstring do letters[index] := ' '; length := 0; current := 0; end end; (* clearstring *) procedure initializestring(var ribbon: string); (* start the string with a nil pointer. This routine should be called before doing linked list work. This allows the standard string routines to clear the string without killing the pointer. *) begin (* initializestring *) clearstring(ribbon); ribbon.next := nil; end; (* initializestring *) (* end module clearstring version = 4.86; (@ of prgmod.p 2004 Sep 8 *) (* begin module filler.fillstring *) procedure fillstring(var s: string; a: filler); (* this procedure makes it reasonably easy to fill the string s with characters. one calls the procedure as: *) (* 1 2 3 4 5 *) (* 12345678901234567890123456789012345678901234567890 *) (* fillstring(s, 'this-is-the-string '); the two comments make it easy to line the characters up. also, for this example, it was assumed that the length of filler as defined by the constant fillermax was 50. *) var length: integer; (* of the string without trailing blanks *) index: integer; (* of s *) begin (* fillstring *) clearstring(s); length := fillermax; while (length > 1) and (a[length] = ' ') do length := pred(length); if (length = 1) and (a[length] = ' ') then begin writeln(output, 'fillstring: the string is empty'); halt end; for index := 1 to length do s.letters[index] := a[index]; s.length := length; s.current := 1 end; (* fillstring *) (* end module filler.fillstring version = 7.67; {of delmod.p 2004 Sep 8} *) (* begin module filler.filltrigger *) procedure filltrigger(var t: trigger; a: filler); (* fill the trigger t *) begin (* filltrigger *) fillstring(t.seek,a) end; (* fillstring *) (* end module filler.filltrigger version = 7.67; {of delmod.p 2004 Sep 8} *) (* begin module trigger.proc *) (* this module allows one to scan a series of characters, as from an array or a file, and to "trigger" or detect a simple string in the series. the advantage of the trigger is that several triggers can "observe" a stream of characters at once, each looking for a different thing. some other modules required: interact.const, interact.type *) procedure resettrigger(var t: trigger); (* reset the trigger to ground state *) begin (* resettrigger *) with t do begin state := 0; skip := false; found := false end end; (* resettrigger *) procedure testfortrigger(ch: char; var t: trigger); (* look at the character ch. if it is part of the trigger (at the current trigger state), then the trigger state goes higher. if it is not part of the trigger then the trigger state is reset, skip is true and one should skip onward to find the trigger. if the trigger is found, found is true. 1996 Sep 12: Bug found! In the case of a trigger "ab", the program used to miss it for situations like "aab". This was because at the first a it would step up. Then it would see the second a and recognize that was not part of ab. It would fail to realize that it could be the start of a new one. The code now accounts for that possibility. *) begin (* testfortrigger *) with t do begin state := succ(state); { writestring(list,seek); writeln(list,'testfortrigger seek.letters[',state:1,']:', seek.letters[state],' ch:',ch); } if seek.letters[state] = ch then begin skip := false; if state = seek.length then found := true else found := false end else begin (* it failed. But wait! It could be the beginning of a NEW trigger string! *) if seek.letters[1] = ch then begin state := 1; skip := false; found := false end else begin (* reset trigger *) state := 0; skip := true; found := false end end end end; (* testfortrigger *) (* end module trigger.proc version = 7.67; {of delmod.p 2004 Sep 8} *) (***************************************************************************) (***************************************************************************) (* begin module readshift *) procedure readshift(var inst: text; var fromcoord: integer; var tocoord: integer; var fromrange: integer; var torange: integer; var orientation: integer); (* Read from-to coordinates from inst file. The routine now reads an entire delila instruction of the form: get from 7647 -100 to 7666 +100 direction +; or get from 8647 -100 to same +100 direction +; HOWEVER, the get must already have been read. This allows it to be found in the instructions. NOTE: orientation can be -1, +1 or 0. 0 simply means it was not read in. Generally positive orientation can be assumed in this case, but not always. For the range computation, assume it is positive. 2009 Apr 10, 1.62: allow inst without from-ranges This form now works: get from 8647 to same +100 direction +; *) begin { skipcolumn(inst); (* skip 'get' *) } skipcolumn(inst); (* skip 'from' *) read(instin,fromcoord); (* read 'from' coordinate *) {copyaline(inst, output); halt;} {writeln(output,'readshift: fromcoord is ',fromcoord:1);} skipblanks(inst); (* 2009 apr 10: skip to 'to' or fromrange *) if inst^ = 't' then fromrange := 0 else read(instin,fromrange); (* read 'from' range *) {writeln(output,'readshift: fromrange is ',fromrange:1);} skipblanks(inst); (* 2009 apr 27: skip to 'to' *) if inst^ <> 't' then begin writeln(output, 'expected "to" instruction, but found this:'); copyaline(inst, output); halt; end; skipcolumn(inst); (* skip 'to' *) (* read the word 'same' and use the fromcoord or read the tocoord *) skipblanks(inst); (* prepare to test for 'same' in next letter *) if inst^ = 's' then begin skipnonblanks(inst); (* skip 'same' *) tocoord := fromcoord; end else begin read(instin, tocoord); end; {writeln(output,'readshift: inst^ is "',inst^,'"');} read(instin, torange); (* read 'to' range *) {writeln(output,'readshift: torange is ', torange:1);} (* NOTE: semicolon is NOT READ *) skipblanks(inst); (* prepare to test for 'direction' in next letter *) if inst^ = 'd' then begin skipnonblanks(inst); (* skip 'direction' *) skipblanks(inst); (* skip to direction *) { writeln(output,'readshift: inst^ is "',inst^,'"');} if inst^ = '-' then orientation := -1 else orientation := +1; get(inst) (* move past the orientation character *) end else orientation := +0; (* 2001 May 21 bug fix *) { for testing: if (fromcoord = 1286) or (fromcoord = 177) then begin writeln(output,'readshift: fromcoord is ',fromcoord:1); writeln(output,'readshift: fromrange is ',fromrange:1); writeln(output,'readshift: tocoord is ', tocoord:1); writeln(output,'readshift: torange is ', torange:1); writeln(output,'readshift: orientation is ', orientation:1); end; } end; (* end module readshift *) (* begin module shift *) procedure shift(var fromcoord: integer; var tocoord: integer; fromrange: integer; torange: integer; numbtoshift: integer); (* shift from coordinate by amount numtoshift *) begin (* determine orientation *) if fromrange < 0 then begin fromcoord := fromcoord + numbtoshift; tocoord := tocoord + numbtoshift; end else if fromrange > 0 then begin fromcoord := fromcoord - numbtoshift; tocoord := tocoord - numbtoshift; end else begin (* fromrange = 0 *) if torange >= 0 then begin (* allow for 0 wide *) fromcoord := fromcoord + numbtoshift; tocoord := tocoord + numbtoshift; end else if torange < 0 then begin fromcoord := fromcoord - numbtoshift; tocoord := tocoord - numbtoshift; end; end; end; (* end module shift *) (* begin module writeshift *) procedure writeshift(var instout: text; fromcoord: integer; tocoord: integer; fromrange: integer; torange: integer; orientation: integer); (* Write out the shifted instructions. No carriage return is given at the end, to allow any remaining parts to be copied. The final semicolon (or other material) is to be copied from the inst file later. The word 'get' is NOT output because it is copied during reading. *) procedure signednumber(var instout: text; i: integer); (* produce the number, signed *) begin if i < 0 then write(instout, ' ',i:1) else write(instout,' +',i:1); end; begin write(instout, ' from ',fromcoord:1); signednumber(instout, fromrange); write(instout, ' to '); if fromcoord = tocoord then write(instout, 'same') else write(instout, tocoord:1); signednumber(instout, torange); if orientation <> 0 then begin write(instout, ' direction '); if orientation = -1 then write(instout,'-') else write(instout,'+'); end; end; (* end module writeshift *) (* begin module delilacomment *) procedure delilacomment(previous, current: char; var comparen, comcurly: boolean); (* Detect delila comments. Given the previous and current characters, determine if we are inside either a parenthesis comment or a curly comment. *) const asterisk = '*'; (* asterisk *) leftcurly = '{'; (* left curly parenthesis *) leftparen = '('; (* left parenthesis *) rightcurly = '}'; (* right curly parenthesis *) rightparen = ')'; (* right parenthesis *) begin if ((previous = leftparen) and (current = asterisk)) then comparen := true; if comparen and (not comcurly) and ((previous = asterisk) and (current = rightparen)) then comparen := false; if (not comparen) and (current = leftcurly) then comcurly := true; if comcurly and (not comparen) and (current = rightcurly) then comcurly := false; end; (* end module delilacomment *) (* begin module quotecomment *) procedure quotecomment(previous, current: char; var comparen, comcurly: boolean; var singlequote, doublequote: boolean); (* Detect quotes and comments. Given the previous and current characters, determine if we are inside either a parenthesis comment or a curly comment, a single quote or a double quote. 2001 Oct 5, 1.42: single quotes exclude double quotes in proc quotecomment A delila title like title "Hanah Margalit's promoter database from-75 to 25"; made the program think that everything after the single quote was inside quotes! *) const single = ''''; (* single quote *) double = '"'; (* double quote *) begin if not (comparen or comcurly) then begin { 2001 Oct 5 Bad code: if current = single then singlequote := not singlequote; if current = double then doublequote := not doublequote; fix: } if current = single then if not doublequote then singlequote := not singlequote; if current = double then if not singlequote then doublequote := not doublequote; end; if not (singlequote or doublequote) then begin delilacomment(previous,current,comparen,comcurly); end; end; (* end module quotecomment *) (* begin module copytheinst *) procedure copytheinst(var fin, fout: text; var previous, c: char; var comparen, comcurly: boolean; var singlequote, doublequote: boolean; checkquotecomment: boolean); (* Copy the current delila instruction from file fin to file fout until the end of the instruction. Check for comments using quotecomment while we go if checkquotecomment is true. *) begin (* copytheinst *) while (not eof(fin)) and (c <> ';') do begin if eoln(fin) then begin readln(fin); writeln(fout); c := ' '; previous := c; end else if not eof(fin) then begin previous := c; read(fin, c); write(fout, c); if checkquotecomment then quotecomment(previous,c,comparen,comcurly,singlequote,doublequote); end; end; end; (* copytheinst *) (* end module copytheinst *) (* begin module random *) procedure random(var seed: real); (* random generator 2. version = 1.01 of random.2 1990 Oct 2 origin 1986 December 31 Test this routine with the program tstrnd. written by David Masternarde *) (* This random number generator is based on a shift register with a single bit of feedback, as described in Electronics for Neurobiologists, by Brown, Maxfield and Moraff, MIT press 1973, referencing Random Process Simulation and Measurement by Korn, McGraw-Hill 1966. The random seed rand, a number between 0 and 1 exclusive, is converted to an integer between 1 and 2**23-1, inclusive. This 23-bit number is shifted right one bit and the output of the last (23rd) bit and the 9th bit are added modulo 2 (exclusive orred) and fed back into the new first bit. This is done between 4 and 11 times, depending on the last 3 bits of the original number. The result is converted back to a real number between 0 and 1 from which the 23 bit integer can be recovered on the next call. The 23-bit shift register goes through all 2**23-1 values before repeating; the repetition frequency of this algorithm could be less or greater depending on the seed, because of the random number of multiple shifts per call. *) (* powers of 2 *) const pow14=16384; pow15=32768; pow22= 4194304; pow23=8388608; var iseed, (* integer shift register *) i, nrep: integer; (* index, number of times to do shift *) begin (* random *) iseed := round(seed*pow23); (* convert to 23 bit number *) if (iseed<1) or (iseed>=pow23) then iseed := 1; nrep := 4 + iseed mod 8; (* do it 4 to 11 times based on last 3 bits *) for i:= 1 to nrep do (* if last bit and 9th bit are equal, feed back a 0, otherwise a 1 *) if( odd(iseed) = ((iseed mod pow15) >= pow14)) then iseed := iseed div 2 else iseed := (iseed div 2) + pow22; seed := iseed/pow23; end; (* random *) (* end module random *) (* begin module getdatetime *) procedure getdatetime(var adatetime: datetimearray); (* get the date and time into a single array from the system clock. adatetime contains the date: 1980/06/09 18:49:11 ye mo da ho mi se (year, month, day, hour, minute, second). As of 2000 February 18, the Sun Pascal compiler requires a formatting statement. This statement allows the date to be generated in this standard Delila format in a single call. Information about the formatting statement is available on the manual page for date in Unix. If a computer does not have this method, see the 'oldgetdatetime' routine in delmod.p (http://www.lecb.ncifcrf.gov/~toms/delila/delmod.html) for some conversion code. GPC Functions: function GetUnixTime (var MicroSecond : Integer) : UnixTimeType; http://agnes.dida.physik.uni-essen.de/~gnu-pascal/gpc_109.html#SEC109 7.10.8 Date And Time Routines procedure GetTimeStamp (var t : TimeStamp); function Date (t : TimeStamp) : packed array [1 .. DateLength] of Char; function Time (t : TimeStamp) : packed array [1 .. TimeLength] of Char; DateLength and TimeLength are implementation dependent constants. GetTimeStamp (t) fills the record `t' with values. If they are valid, the Boolean flags are set to True. TimeStamp is a predefined type in the Extended Pascal standard. It may be extended in an implementation, and is indeed extended in GPC. For the full definition of `TimeStamp', see section 8.255 TimeStamp. *) var t: TimeStamp; (* begin module pluckdigit *) function pluckdigit(number, logplace:integer): char; (* return the digit at the place value ('logplace') position of number. example: pluckdigit(13625, 3) = 3 pluckdigit(13625, 4) = 1 This routine was taken from module numberdigit in prgmod.p, but is modified so as not to give the sign. Instead it gives zeros above the digits. 'myabsolute' replaced 'absolute', which is apparently a keyword for GPC. The name is kept for to keep the code looking similar to its origin. *) var place: integer; (* the exponent of logplace *) count: integer; (* used to make place *) myabsolute: integer; (* the absolute value of number *) acharacter: char; (* the character to be returned *) procedure digit; (* extract a digit at the place position *) var tenplace: integer; (* ten times place *) z: integer; (* an intermediate value *) d: integer; (* the digit extracted *) begin (* digit *) tenplace:=10*place; z:=myabsolute-((myabsolute div tenplace)*tenplace); if place = 1 then d:=z else d:= z div place; case d of 0: acharacter:='0'; 1: acharacter:='1'; 2: acharacter:='2'; 3: acharacter:='3'; 4: acharacter:='4'; 5: acharacter:='5'; 6: acharacter:='6'; 7: acharacter:='7'; 8: acharacter:='8'; 9: acharacter:='9'; end end; (* digit *) begin (* pluckdigit *) place:=1; for count:=1 to logplace do place:=10*place; if number=0 then begin acharacter:='0' end else begin myabsolute:=number; if myabsolute >= place then digit else acharacter := '0' end; pluckdigit:=acharacter end; (* pluckdigit *) (* end module pluckdigit *) begin (* according to: http://agnes.dida.physik.uni-essen.de/~gnu-pascal/gpc_109.html#SEC109 *) GetTimeStamp(t); (* Predefined time stamp: http://agnes.dida.physik.uni-essen.de/~gnu-pascal/gpc_389.html#SEC389 TimeStamp = {@@packed} record DateValid, TimeValid : Boolean; Year : Integer; Month : 1 .. 12; Day : 1 .. 31; DayOfWeek : 0 .. 6; { 0 means Sunday } Hour : 0 .. 23; Minute : 0 .. 59; Second : 0 .. 61; { to allow for leap seconds } MicroSecond : 0 .. 999999 end; *) with t do begin if TimeValid then begin { writeln(output,'valid time'); writeln(output,'year =',year:4); writeln(output,'month =',month:2); writeln(output,'day =',day:2); writeln(output,'hour =',hour:2); writeln(output,'minute =',minute:2); writeln(output,'second =',second:2); } adatetime := 'year/mm/dd hh:mm:ss'; adatetime[ 1] := pluckdigit(Year,3); adatetime[ 2] := pluckdigit(Year,2); adatetime[ 3] := pluckdigit(Year,1); adatetime[ 4] := pluckdigit(Year,0); adatetime[ 6] := pluckdigit(Month,1); adatetime[ 7] := pluckdigit(Month,0); adatetime[ 9] := pluckdigit(Day,1); adatetime[10] := pluckdigit(Day,0); adatetime[12] := pluckdigit(Hour,1); adatetime[13] := pluckdigit(Hour,0); adatetime[15] := pluckdigit(Minute,1); adatetime[16] := pluckdigit(Minute,0); adatetime[18] := pluckdigit(Second,1); adatetime[19] := pluckdigit(Second,0); end else begin writeln(output,'getdatetime: invalid time!'); halt; end end; { Sun compiler method: date(adatetime,'%Y/%m/%d %H:%M:%S'); } end; (* end module getdatetime version = 1.17; (@ of timegpc.p 2002 Oct 9 *) (* begin module timeseed *) (* Read the computer date and time. Reverse the order of the digits and put a decimal point in front. This gives a fraction between zero and one that varies quite quickly, and is always unique (if the computer has sufficient accuracy). It is to be used as a seed to a random number generator. This has the nice property that the seed changes every second and does not repeat for thousands of years! *) procedure addtoseed(var seed, power: real; c: char); (* add the digit represented by c to the seed at the power position *) var n: integer; (* the character represented by c *) begin (* addtoseed *) power := power/10; { writeln(output,'addtoseed, c = ',c); writeln(output,'addtoseed, ord(c) = ',ord(c)); } n := ord(c) - ord('0'); if (n < 0) or (n > 9) then begin writeln(output,'timeseed: error in datetime'); writeln(output,'it contains "',c,'" which is not a number.'); writeln(output,'The getdatetime routine must be fixed.'); halt; end; seed := seed + power*n end; (* addtoseed *) procedure makeseed(adatetime: datetimearray; var seed: real); (* convert adatetime to a real number in seed, reversed order Here is the standard adatetime format: 123456789 123456789 1 1980/06/09 18:49:11 *) var power: real; (* a digit of the seed such as 0.01 *) begin seed := 0.0; power := 1.0; addtoseed(seed, power, adatetime[19]); addtoseed(seed, power, adatetime[18]); (* : *) addtoseed(seed, power, adatetime[16]); addtoseed(seed, power, adatetime[15]); (* : *) addtoseed(seed, power, adatetime[13]); addtoseed(seed, power, adatetime[12]); (* *) addtoseed(seed, power, adatetime[10]); addtoseed(seed, power, adatetime[ 9]); (* / *) addtoseed(seed, power, adatetime[ 7]); addtoseed(seed, power, adatetime[ 6]); (* / *) addtoseed(seed, power, adatetime[ 4]); addtoseed(seed, power, adatetime[ 3]); addtoseed(seed, power, adatetime[ 2]); addtoseed(seed, power, adatetime[ 1]); end; procedure orderseed(adatetime: datetimearray; var seed: real); (* convert adatetime to a real number in seed, normal order *) var power: real; (* a digit of the seed such as 0.01 *) begin seed := 0.0; power := 1.0; addtoseed(seed, power, adatetime[ 3]); addtoseed(seed, power, adatetime[ 4]); addtoseed(seed, power, adatetime[ 6]); addtoseed(seed, power, adatetime[ 7]); (* / *) addtoseed(seed, power, adatetime[ 9]); addtoseed(seed, power, adatetime[10]); (* / *) addtoseed(seed, power, adatetime[12]); addtoseed(seed, power, adatetime[13]); (* *) addtoseed(seed, power, adatetime[15]); addtoseed(seed, power, adatetime[16]); (* : *) addtoseed(seed, power, adatetime[18]); addtoseed(seed, power, adatetime[19]); end; procedure timeseed(var seed: real); (* read the computer date and time. reverse the order of the digits and put a decimal point in front. this gives a fraction between zero and one that varies quite quickly, and is always unique (if the computer has sufficient accuracy). it is to be used as a seed to a random number generator. *) var adatetime: datetimearray; (* a date and time *) begin (* timeseed *) getdatetime(adatetime); { writeln(output,'timeseed: adatetime: ',adatetime); } makeseed(adatetime, seed); end; (* timeseed *) (* end module timeseed version = 1.17; (@ of timegpc.p 2002 Oct 9 *) (* begin module instloop *) procedure instloop(var instin, instout: text; params: parameters); (* Read through inst file and produce instout file. Skip comments. *) const mutationmax = 1000; (* maximum length of sequences that can be mutated *) pwid = 10; (* characters for reporting mutation fractions *) pdec = 8; (* decimals for reporting mutation fractions *) var b: integer; (* index to bases *) bases: array[0..3] of char; (* the bases *) c: char; (* generic character *) comparen: boolean; (* if true, we are inside a standard comment *) comcurly: boolean; (* if true, we are inside a curly comment *) previous: char; (* the previous value of c *) fromcoord: integer; (* coordinate of zero base (ie From part) *) tocoord: integer; (* coordinate of end base (ie To part) *) fromrange: integer; (* the lower bound of the interval grabbed *) torange: integer; (* the upper bound of the interval grabbed *) gettrigger: trigger; (* trigger for get instruction *) piecetrigger: trigger; (* trigger for piece instruction *) orientation: integer; (* orientation of the interval *) singlequote, doublequote: boolean; (* single and double quote states *) notfirst: boolean; (* first get in a set? *) {mmm} (* variables for random mutations *) seed: real; (* a random number, seed for the next random number *) mutationarray: array[0..mutationmax] of char; (* storage for mutations *) storepoint: integer; (* storage point in the mutationarray *) aremutations: boolean; (* are there mutations in this sequence? *) countmutations: integer; (* count of mutations *) countbases: integer; (* count of bases possibly mutated *) begin comcurly := false; comparen := false; singlequote := false; doublequote := false; previous := ' '; c := ' '; bases[0] := 'a'; bases[1] := 'c'; bases[2] := 'g'; bases[3] := 't'; (* set up variables for random mutations *) {mmm} if (0 < params.initialseed) and (params.initialseed < 1) then seed := params.initialseed else timeseed(seed); countmutations := 0; countbases := 0; (* 1 2 3 4 5 *) (* 12345678901234567890123456789012345678901234567890 *) filltrigger(gettrigger, 'get '); resettrigger(gettrigger); filltrigger(piecetrigger, 'piece '); resettrigger(piecetrigger); while not eof(instin) do begin if eoln(instin) then begin readln(instin); writeln(instout); previous := ' '; resettrigger(gettrigger); resettrigger(piecetrigger); if singlequote then begin writeln(output, 'Multiline single quote string'); halt end; if doublequote then begin writeln(output, 'Multiline double quote string'); halt end; end else begin previous := c; read(instin, c); write(instout, c); (* 2001 Oct 11 Ignore single or double quote marks in a piece instruction. *) testfortrigger(c, piecetrigger); if piecetrigger.found then begin (* 2004 July 8: Wait a second! Don't do this if we are inside a quote!! *) if not (comparen or comcurly or singlequote or doublequote) then copytheinst(instin, instout,previous,c,comparen,comcurly, singlequote,doublequote, false); end else begin quotecomment(previous,c,comparen,comcurly,singlequote,doublequote); {zzz if comparen then write(instout,'P'); if comcurly then write(instout,'C'); if singlequote then write(instout,'S'); if doublequote then write(instout,'_D'); } if not (comparen or comcurly or singlequote or doublequote) then begin testfortrigger(c, gettrigger); if gettrigger.found then begin readshift(instin, fromcoord, tocoord, fromrange, torange, orientation); case params.rangechange of (* nrftNRFT *) 'n','N': ;(* no change to range *) 'r','R': begin if (orientation = +1) or (orientation = 0) then begin (* assume positive orientation when orientation is not given *) fromrange := +params.fromrange; torange := +params.torange; end else begin fromrange := -params.fromrange; torange := -params.torange; end end; 'f','F': begin fromrange := params.fromrange; end; 't','T': begin torange := params.torange; end; 's','S': begin if (orientation = +1) or (orientation = 0) then begin (* assume positive orientation when orientation is not given *) fromrange := fromrange - params.numbtoshift; torange := torange - params.numbtoshift; end else begin fromrange := fromrange + params.numbtoshift; torange := torange + params.numbtoshift; end end; end; shift(fromcoord, tocoord, fromrange, torange, params.numbtoshift); (**************************************************************) (* should one shift and then complement or the other way around? *) if params.rangechange in ['N','R','F','T','S'] then begin fromrange := -fromrange; torange := -torange; orientation := -orientation; end; (**************************************************************) {uuu} writeshift(instout, fromcoord, tocoord, fromrange, torange, orientation); copytheinst(instin, instout,previous,c,comparen,comcurly, singlequote,doublequote, true); end; end; end end end; if comparen then begin writeln(output,'ERROR: unclosed (* *) comment'); end; if comcurly then begin writeln(output,'ERROR: unclosed {} comment'); end; {mmm} if (params.splatter = 'm') then begin writeln(instout, '{ Mutation results:'); writeln(instout, 'total bases: ', countbases:1); writeln(instout, 'total mutations: ', countmutations:1); writeln(instout, 'fraction, mutations/base: ', (countmutations/countbases):pwid:pdec); writeln(instout, 'requested mutations/base: ', (params.mutations):pwid:pdec); writeln(instout, '}'); end; end; (* end module instloop *) (* begin module readparameters *) procedure readparameters(var instshiftp: text; var params: parameters); (* read the parameters. The routine will upgrade the parameter file if it is old. The original parameter file was: ---- 1.00 version of instshift that this parameter file is designed for. +10 number of bases to shift the coordinate system ---- *) procedure upgradeto129; (* upgrade to version 1.29 parameters: ---- 1.29 version of instshift that this parameter file is designed for. +10 number of bases to shift the coordinate system n -10 +10 rangechange: nrft fromrange torange ---- *) begin with params do begin writeln(output,'upgrading instshiftp to version 1.29 ...'); readln(instshiftp, numbtoshift); rewrite(instshiftp); writeln(instshiftp,'1.29 version of instshift', ' that this parameter file is designed for.'); if numbtoshift > 0 then write(instshiftp,'+'); writeln(instshiftp,numbtoshift:1,' number of bases to shift', ' the coordinate system'); writeln(instshiftp,'n -10 +10 rangechange: nrft fromrange torange'); end; end; procedure upgradeto134; { upgrade to version 1.34 parameters: ---- 1.34 version of instshift that this parameter file is designed for. +10 number of bases to shift the coordinate system n -10 +10 rangechange: nrft fromrange torange (**) ---- } var hold: text; (* for holding the current parameters *) begin with params do begin writeln(output,'upgrading instshiftp to version 1.34 ...'); reset(instshiftp); readln(instshiftp); (* skip previous version number *) rewrite(hold); while not eof(instshiftp) do copyaline(instshiftp, hold); rewrite(instshiftp); writeln(instshiftp,'1.34 version of instshift', ' that this parameter file is designed for.'); reset(hold); while not eof(hold) do copyaline(hold, instshiftp); end; end; procedure upgradeto158; { upgrade to version 1.58 parameters: ---- 1.34 version of instshift that this parameter file is designed for. +10 number of bases to shift the coordinate system n -10 +10 rangechange: nrft fromrange torange (**) ---- } var hold: text; (* for holding the current parameters *) begin with params do begin writeln(output,'upgrading instshiftp to version 1.58 ...'); reset(instshiftp); readln(instshiftp); (* skip previous version number *) rewrite(hold); while not eof(instshiftp) do copyaline(instshiftp, hold); rewrite(instshiftp); writeln(instshiftp,'1.58 version of instshift', ' that this parameter file is designed for.'); reset(hold); while not eof(hold) do copyaline(hold, instshiftp); end; end; begin with params do begin reset(instshiftp); readln(instshiftp, parameterversion); if round(100*parameterversion) < round(100*updateversion) then begin writeln(output, 'You have an old parameter file!'); writeln(output, 'UPGRADING parameter file!'); if parameterversion < 1.29 then upgradeto129; if parameterversion < 1.34 then upgradeto134; if parameterversion < 1.58 then upgradeto158; reset(instshiftp); readln(instshiftp); (* skip the version number *) end; readln(instshiftp, numbtoshift); readln(instshiftp, rangechange, fromrange, torange); if not (rangechange in ['n','r','f','t','s', 'N','R','F','T','S']) then begin write (output,'rangechange must be one of "nrftsNRFTS",'); writeln(output,' but it was "',rangechange,'"'); halt; end; end; end; (* end module readparameters *) (* begin module instshift.themain *) procedure themain(var instin, instshiftp, instout: text); (* the main procedure of the program *) var params: parameters; (* the parameters to be read in *) begin writeln(output,'instshift ',version:4:2); readparameters(instshiftp, params); reset(instin); rewrite(instout); instloop(instin, instout, params); write(instout,'(* instshift ', version:4:2); write(instout,', shifted ', params.numbtoshift:1,' bases,'); case params.rangechange of (* nrft *) 'n','N': write(instout,' no change to range'); 'r','R': write(instout,' range set to ', params.fromrange:1, ' to ', params.torange:1); 'f','F': write(instout,' from of range set to ', params.fromrange:1); 't','T': write(instout,' to of range set to ', params.torange:1); 's','S': write(instout,' range size kept constant, shifted'); end; if params.rangechange in ['N','R','F','T'] then writeln(instout,' COMPLEMENT'); writeln(instout,' *)'); end; (* end module instshift.themain *) begin themain(instin, instshiftp, instout); 1: end.