program helix(xbook,ybook,hlist,helixp,output); (* helix: find helices between sequences in two books by thomas schneider modules from: delmods *) label 1; (* end of program *) const (* begin module version *) version = 3.24; (* of helix 1994 sep 5 origin before 1990 December 21 surely before 1984 ... *) (* end module version *) (* begin module describe.helix *) (* name helix: find helices between sequences in two books synopsis helix(xbook: in, ybook: in, hlist: out, helixp: in, output: out) files xbook: a book from the delila system ybook: a book from the delila system hlist: a list of helices between pieces in xbook and ybook. the first line is the program identification the second two lines are the x and y book titles the third line gives the minimum length or the maximum energy of helixes recorded the fourth line states whether or not g-u pairs are allowed the fifth line states whether or not energies are printed the following lines are the helices breaks between sequences are indicated. helixp: parameters that control the helix list. if helixp is empty, default values are used. otherwise, the file must contain three lines: 1. if this number is a positive integer, it specifies the minimum length in base pairs of the helixes written in hlist. if it is a negative real number, it specifies the maximum energy in kcal of the helixes written. 2. if the first character is a "g" then g-u pairs are allowed, otherwise not. 3. if the first character is an "e" then the energy of each helix will be written in hlist. output: messages to the user description All sequences in xbook are compared to all sequences in ybook. The complementary helices (of some minimum length and longer or of some maximum energy or less) are listed in hlist by the 5' ends of the helix on both sequences. This information, along with the length of the helix, determines the location of the helix. One can allow g-u pairing if desired. If the helix lengths desired are very short, it is better to use dotmat (see "technical notes" below). The new Rules are now used to calculate the helix. documentation delman.use.comparison J. V. Maizel, Jr. and R. P. Lenk PNAS 78: 7665-7609 (1981) Tinoco et al. Nature New Biology vol 246 pp 40-41, 1973. S. M. Freier, R. Kierzek, J. A. Jaeger, N. Sugimoto, M. H. Caruthers, T. Nelson, and D. H. Turner, "Improved free-energy paramters for predictions of RNA duplex stability" PNAS 83: 9373-9377 (1986) see also matrix.p, dotmat.p, keymat.p author Thomas D. Schneider bugs GU pairs and bulges are not done using the new data. An option for pair-wise (rather than multiplicative) comparisons of sequences would be nice. technical notes The shortest length helix ever recorded in hlist is determined by the constant absminlength. This overrides the parameters. *) (* end module describe.helix *) (* technical note: lines of hlist are in the format made by procedure hrecord. breaks between sequences are in the format made by procedure nextsequences *) (* more constants: *) absminlength = 2; (* the minimum length below which one should not use helix, and should use the dotmat program *) defminlenmaxene = 5; (* default for minlenmaxene *) defguallowed = false; (* default for guallowed *) defgivefreeenergy = false; (* default for givefreeenergy *) prime = ''''; (* single quote mark *) debugging = true; (* for debugging purposes *) (* begin module book.const *) (* constants needed for book manipulations *) dnamax = 3000; (* length of dna arrays *) namelength = 20; (* maximum key name length *) linelength = 80; (* maximum line readable in book *) (* end module book.const version = 'delmod 6.51 85 apr 17 tds/gds' *) type (* begin module book.type *) (* types needed for book manipulations *) chset = set of 'a'..'z'; (* types defined in book definition *) alpha = packed array[1..namelength] of char; (* this is not alfa *) (* name is a left justified string with blanks following the characters *) name = record letters: alpha; length: 0..namelength (* zero means an unspecified structure *) end; lineptr = ^line; line = record (* a line of characters *) letters: packed array [1..linelength] of char; length: 0..linelength; next: lineptr end; direction = (plus, minus, dircomplement, dirhomologous); configuration = (linear, circular); state = (on, off); header = record (* header of key *) keynam: name; (* key name of structure *) fulnam: lineptr; (* full name of structure *) note: lineptr (* note key *) end; (* base types *) base = (a,c,g,t); dnaptr = ^dnastring; dnarange = 0..dnamax; seq = packed array[1..dnamax] of base; dnastring = record part: seq; length: dnarange; next: dnaptr end; orgkey = record (* organism key *) hea: header; mapunit: lineptr (* genetic map units *) end; chrkey = record (* chromosome key *) hea: header; mapbeg: real; (* number of genetic map beginning *) mapend: real (* number of genetic map ending *) end; pieceptr = ^piece; piekey = record (* piece key *) hea: header; mapbeg: real; (* genetic map beginning *) coocon: configuration; (* configruation (circular/linear) *) coodir: direction; (* direction (+/-) relative to genetic map *) coobeg: integer; (* beginning nucleotide *) cooend: integer; (* ending nucleotide *) piecon: configuration; (* configruation (circular/linear) *) piedir: direction; (* direction (+/-) relative to coordinates *) piebeg: integer; (* beginning nucleotide *) pieend: integer; (* ending nucleotide *) end; piece = record key: piekey; dna: dnaptr end; reference = record pienam : name; (* name of piece referred to *) mapbeg : real; (* genetic map beginning *) refdir : direction; (* direction relative to coordinates *) refbeg : integer; (* beginning nucleotide *) refend : integer; (* ending nucleotide *) end; genkey = record (* gene key *) hea : header; ref : reference; end; trakey = record (* transcript key *) hea : header; ref : reference; end; markerptr = ^marker; markey = record (* marker key *) hea : header; ref : reference; sta : state; phenotype : lineptr; next : markerptr; end; marker = record key : markey; dna : dnaptr; end; (* end module book.type version = 'delmod 6.51 85 apr 17 tds/gds' *) var xbook, (* the primary sequences *) ybook, (* the secondaries compared to xbook *) hlist, (* list of exact helices *) helixp: (* parameters for the program *) text; xseq, yseq: pieceptr; (* sequences from xbook and ybook *) xnumber, ynumber: integer; (* numbers of each piece *) minlenmaxene: real; (* minimum helix length recorded in hlist or the highest energy when negative *) guallowed: boolean; (* gu pairs are allowed *) givefreeenergy: boolean; (* calculate the free energy of a helix and add it to the hlist *) (* begin module book.var *) (* ************************************************************************ *) (* global variables needed for book manipulations *) (* free storage: *) freeline: lineptr; (* unused lines *) freedna: dnaptr; (* unused dnas *) readnumber: boolean; (* whether to read a number from the notes, or to read in the notes *) number: integer; (* the number of the item just read *) numbered: boolean; (* true when the item just read is numbered *) skipunnum: boolean; (* a control variable to allow skipping of un-numbered items in the book *) (* ************************************************************************ *) (* end module book.var version = 'delmod 6.51 85 apr 17 tds/gds' *) (* begin module halt *) procedure halt; (* stop the program. the procedure performs a goto to the end of the program. you must have a label: label 1; declared, and also the end of the program must have this label: 1: end. examples are in the module libraries. this is the only goto in the delila system. *) begin writeln(output,' program halt.'); goto 1 end; (* end module halt version = 'delmod 6.51 85 apr 17 tds/gds' *) (* begin module package.getpiece *) (* ************************************************************************ *) (* begin module package.brpiece *) (* ************************************************************************ *) (* begin module book.basis *) (* procedures needed for book manipulations *) (* get procedures should be used for all linked lists of records *) procedure getline(var l: lineptr); (* obtain a line from the free line list or by making a new one *) begin if freeline<>nil then begin l:=freeline; freeline:=freeline^.next end else new(l); l^.length:=0; l^.next:=nil end; procedure getdna(var l: dnaptr); begin if freedna<>nil then begin l:=freedna; freedna:=freedna^.next end else new(l); l^.length:=0; l^.next:=nil end; (* clear procedures should be called each time the records are no longer needed failure to do this may result in a stack overflow. *) procedure clearline(var l: lineptr); (* return a line to the free line list *) var lptr: lineptr; begin if l<>nil then begin lptr:=l; l:=l^.next; lptr^.next:=freeline; freeline:=lptr end end; procedure cleardna(var l: dnaptr); var lptr: dnaptr; begin if l<>nil then begin lptr:=l; l:=l^.next; lptr^.next:=freedna; freedna:=lptr end end; procedure clearheader(var h: header); (* clear the header h (remove lines to free storage) *) begin with h do begin clearline(fulnam); while note<>nil do clearline(note) end end; procedure clearpiece(var p: pieceptr); (* clear the dna of the piece *) begin while p^.dna<>nil do cleardna(p^.dna); clearheader(p^.key.hea) end; function chartobase(ch:char):base; (* convert a character into a base *) begin case ch of 'a': chartobase:=a; 'c': chartobase:=c; 'g': chartobase:=g; 't': chartobase:=t end end; function basetochar(ba:base):char; (* convert a base into a character *) begin case ba of a: basetochar:='a'; c: basetochar:='c'; g: basetochar:='g'; t: basetochar:='t'; end end; function complement(ba:base):base; (* take the complement of ba *) begin case ba of a: complement:=t; c: complement:=g; g: complement:=c; t: complement:=a; end end; function pietoint(p: integer; pie: pieceptr): integer; (* p is a coordinate on the piece. we want to transform p into a number from 1 to n: an internal coordinate system for easy manipulation of piece coordinates *) var i: integer; (* an intermediate value *) begin with pie^.key do begin case piedir of plus: if p>=piebeg then i:=p-piebeg+1 else i:=(p-coobeg)+(cooend-piebeg)+2; minus: if p<=piebeg then i:=piebeg-p+1 else i:=(cooend-p)+(piebeg-coobeg)+2 end; pietoint:=i end end; function inttopie(i: integer; pie: pieceptr):integer; (* i is in the range 1 to some maximum. it is an internal coordinate system for the program. we want to do a coordinate transformation to obtain a value in the range of the piece called pie: i=1 corresponds to piebeg and i=its maximum corresponds to pieend *) var p: integer; (* an intermediate value *) begin with pie^.key do begin case piedir of plus: begin p:=piebeg+(i-1); if p>cooend then if coocon=circular then p:=p-(cooend-coobeg+1) end; minus: begin p:=piebeg-(i-1); if p '*' then begin writeln(output,' procedure skipstar: bad book'); writeln(output,' "*" expected as first character on the line, but "', thefile^,'" was found'); halt end; get(thefile); (* skip the star *) if thefile^ <> ' ' then begin writeln(output,' procedure skipstar: bad book'); writeln(output,' "* " expected on a line but "*', thefile^,'" was found'); halt end; get(thefile) (* skip the blank *) end; (* skipstar *) (* end module book.skipstar version = 'delmod 6.51 85 apr 17 tds/gds' *) (* begin module book.brreanum *) procedure brreanum(var thefile: text; var reanum: real); (* read a real number from the file *) begin skipstar(thefile); readln(thefile,reanum); end; (* end module book.brreanum version = 'delmod 6.51 85 apr 17 tds/gds' *) (* begin module book.brnumber *) procedure brnumber(var thefile: text; var num: integer); (* read a number from the file *) begin skipstar(thefile); readln(thefile,num) end; (* end module book.brnumber version = 'delmod 6.51 85 apr 17 tds/gds' *) (* begin module book.brname *) procedure brname(var thefile: text; var nam: name); (* read a name from the file *) var i: integer; (* an index to the name *) c: char; (* a character read *) begin (* brname *) skipstar(thefile); with nam do begin length:=0; repeat length:=succ(length); read(thefile,c); letters[length] := c until (eoln(thefile)) or (length>=namelength) or (letters[length]=' '); if letters[length]=' ' then length:=length-1; if length 'n' then begin skipstar(thefile); if not eoln(thefile) then begin if thefile^ = '#' then begin numbered := true; get(thefile); (* move past the number symbol *) read(thefile,number); end end; repeat readln(thefile) until thefile^ = 'n'; readln(thefile) end else readln(thefile) end end; (* brnotenumber *) (* end module book.brnotenumber version = 'delmod 6.51 85 apr 17 tds/gds' *) (* begin module book.brnote *) procedure brnote(var thefile: text; var note: lineptr); (* read note key *) var newnote: lineptr; (* the new note *) previousnote: lineptr; (* the last line of the notes *) begin (* brnote *) note:=nil; if thefile^ = 'n' then begin (* enter note *) readln(thefile); if thefile^ <> 'n' then begin (* abort null note (n/n) *) getline(note); newnote:=note; while thefile^ <> 'n' do begin (* wait until end of note *) brline(thefile,newnote); previousnote:=newnote; (* get next note *) getline(newnote^.next); newnote:=newnote^.next; end; (* last note was not used, so: *) clearline(newnote); previousnote^.next:=nil; readln(thefile) end else readln(thefile) end end; (* brnote *) (* end module book.brnote version = 'delmod 6.51 85 apr 17 tds/gds' *) (* begin module book.brheader *) procedure brheader(var thefile: text; var hea: header); (* read the header of a key. *) begin with hea do begin (* read key name *) brname(thefile,keynam); (* read full name *) getline(fulnam); brline(thefile,fulnam); (* read note key *) if readnumber then brnotenumber(thefile,note) else brnote(thefile,note) end end; (* end module book.brheader version = 'delmod 6.51 85 apr 17 tds/gds' *) (* begin module book.brpiekey *) procedure brpiekey(var thefile: text; var pie: piekey); (* read piece key *) begin with pie do begin brheader(thefile,hea); brreanum(thefile,mapbeg); brconfig(thefile,coocon); brdirect(thefile,coodir); brnumber(thefile,coobeg); brnumber(thefile,cooend); brconfig(thefile,piecon); brdirect(thefile,piedir); brnumber(thefile,piebeg); brnumber(thefile,pieend); end end; (* end module book.brpiekey version = 'delmod 6.51 85 apr 17 tds/gds' *) (* begin module book.brdna *) procedure brdna(var thefile: text; var dna: dnaptr); (* read in dna from thefile *) (* note: if the dna were circularized, by linking the last dnastring to the first, then the cleardna routine could not clear properly, and would loop forever... there is no reason to do that, since a simple mod function will allow one to access the circle. *) var ch: char; workdna: dnaptr; begin getdna(dna); workdna:=dna; ch:=getto(thefile,['d']); read(thefile,ch); (* skipstar *) while (ch = '*') do begin read(thefile,ch); (* skip blank *) repeat read(thefile,ch); if ch in ['a','c','g','t'] then begin if workdna^.length=dnamax then begin getdna(workdna^.next); workdna:=workdna^.next end; workdna^.length:=succ(workdna^.length); workdna^.part[workdna^.length]:=chartobase(ch) end until eoln(thefile); readln(thefile); (* go to next line *) read(thefile,ch); (* ch is either '*' or 'd' *) end; readln(thefile) end; (* end module book.brdna version = 'delmod 6.51 85 apr 17 tds/gds' *) (* begin module book.brpiece *) procedure brpiece(var thefile: text; var pie: pieceptr); (* read in a piece *) begin brpiekey(thefile,pie^.key); if numbered or (not skipunnum) then brdna(thefile,pie^.dna) end; (* end module book.brpiece version = 'delmod 6.51 85 apr 17 tds/gds' *) (* begin module book.brinit *) procedure brinit(var book: text); (* check that the book is ok to read, and set up the global variables for br routines *) begin (* brinit *) (* halt if the book is bad (first word is 'halt') or the first character is not * *) reset(book); if not eof(book) then begin (* check for the date line *) if book^ <> '*' then begin if book^ <> 'h' then writeln(output, ' this is not the first line of a book:') else writeln(output, ' bad book:'); write(output, ' '); while not (eoln(book) or eof(book)) do begin write(output, book^); get(book) end; writeln(output); halt end end else begin writeln(output, ' book is empty'); halt end; (* initialize free storage *) freeline:=nil; freedna:=nil; readnumber:=true; (* usually we read in numbers for items *) number:=0; (* arbitrary value *) numbered:=false; (* the piece has no number (none yet read in) *) skipunnum:=false; end; (* brinit *) (* end module book.brinit version = 'delmod 6.51 85 apr 17 tds/gds' *) (* ************************************************************************ *) (* end module package.brpiece version = 'delmod 6.51 85 apr 17 tds/gds' *) (* begin module book.getpiece *) procedure getpiece(var thefile: text; var pie: pieceptr); (* move to and read in the next piece in the book *) var ch: char; begin ch:=getto(thefile,['p']); (* get to the next p(iece) in the book *) if ch<>' ' then begin brpiece(thefile,pie); ch:=getto(thefile,['p']); (* read past closing p *) end end; (* end module book.getpiece version = 'delmod 6.51 85 apr 17 tds/gds' *) (* ************************************************************************ *) (* end module package.getpiece version = 'delmod 6.51 85 apr 17 tds/gds' *) (* begin module book.getbase *) function getbase(position: integer; pie: pieceptr):base; (* get a base from the nth position (internal coordinates) of the piece. no protection is made against positions outside the piece *) var workdna: dnaptr; p: integer; (* the last base of the dna part *) begin workdna:=pie^.dna; p:=dnamax; while position>p do begin p:=p+dnamax; workdna:=workdna^.next end; getbase:=workdna^.part[position-(p-dnamax)] end; (* end module book.getbase version = 'delmod 6.51 85 apr 17 tds/gds' *) (* begin module copyaline *) procedure copyaline(var fin, fout: text); (* copy a line from file fin to file fout *) begin (* copyaline *) while not eoln(fin) do begin fout^ := fin^; put(fout); get(fin) end; readln(fin); writeln(fout); end; (* copyaline *) (* end module copyaline version = 'delmod 6.51 85 apr 17 tds/gds' *) (* begin module basepair *) function basepair(xseq,yseq: pieceptr; x,y: integer; guallowed: boolean) :boolean; (* does xseq basepair to yseq at x to y? allow gu pair if guallowed is true. a series of if-then statements are used for speed. *) var bx,by: base; (* bases in the sequences *) begin bx := getbase(x,xseq); by := getbase(y,yseq); if bx = complement(by) then basepair := true else if guallowed then if (bx=g) and (by=t) then basepair := true else if (bx=t) and (by=g) then basepair := true else basepair := false else basepair := false end; (* end module basepair version = 'delmod 6.51 85 apr 17 tds/gds' *) (* begin module freeenergy *) (* this module contains: moveendsin: a procedure to find the core of a helix, where the core is the part that contributes the energy. gu pairs and single base pairs are removed from both ends. freeenergy: a function that uses moveendsin and then calculates the free energy of the core in kcal. *) procedure moveendsin(var x5bound, x3bound: integer; xpiece: pieceptr; var y5bound, y3bound: integer; ypiece: pieceptr); (* move the ends of the helix in. gu pairs contribute no energy, nor do single base pairs surrounded by gu pairs. these are removed. *) begin (* moveendsin *) (* move the bounds inward to the first occurance of base pairing. since a single base pair provides no energy, the successive position is also required to pair. the energy beyond these bounds is zero. *) (* move one end in, decrease x3bound, increase y5bound note 1: x3bound cannot go below succ(x5bound) because the call to getbase would object to looking before the x5" end. the x5 bound is brought up in the second while loop. *) while ((getbase(x3bound,xpiece) <> complement(getbase(y5bound,ypiece))) or (getbase(pred(x3bound),xpiece) <> complement(getbase(succ(y5bound),ypiece))) ) and (x3bound > succ(x5bound)) do begin x3bound := pred(x3bound); y5bound := succ(y5bound) end; (* move the other end in, increase x5bound, decrease y3bound note 2: for efficiency,x5bound can stop incrementation when it passes x3bound *) while ((getbase(x5bound,xpiece) <> complement(getbase(y3bound,ypiece))) or (getbase(succ(x5bound),xpiece) <> complement(getbase(pred(y3bound),ypiece))) ) and (x5bound < x3bound) do begin x5bound := succ(x5bound); y3bound := pred(y3bound) end end; (* moveendsin *) function freeenergy(x5start, x3start: integer; xpiece: pieceptr; y5start, y3start: integer; ypiece: pieceptr): real; (* evaluate the free energy of the pairing (x3start, y5start) to (x5start, y3start). interior loops are evaluated only for the case where both sides are equal in length. the units of the energy returned are kcalorie. based on Turner et al and tinoco et al nature new biology vol 246 pp 40-41, 1973. internal coordinates (1 to n) are used. gu pairs are eliminated from the ends by calling moveendsin. *) var x5, x3, y5, y3: integer; (* 5 and 3 prime ends on x and y pieces *) x5b, x3b, y5b, y3b: base; (* the bases corresponding to the above *) x5bound, x3bound, y5bound, y3bound: integer; (* the bounds of energy calculation *) energy: real; (* the current total energy *) bulgebases: integer; (* the number of bases involved in a bulge *) bulging: boolean; (* flag for continuing a bulge after a single match *) procedure readofftable; (* create a 4 by 4 table based on table 1 in Turner et al. we know at this point that both base pairs complement, so only one pair is used. *) var deltag: real; (* temporary curve *) begin case x5b of a: case x3b of a: deltag := -0.9; c: deltag := -2.1; g: deltag := -1.7; t: deltag := -0.9 end; c: case x3b of a: deltag := -1.8; c: deltag := -2.9; g: deltag := -2.0; t: deltag := -1.7 end; g: case x3b of a: deltag := -2.3; c: deltag := -3.4; g: deltag := -2.9; t: deltag := -2.1 end; t: case x3b of a: deltag := -1.1; c: deltag := -2.3; g: deltag := -1.8; t: deltag := -0.9 end; end; energy := energy + deltag end; procedure gupair; (* in the Freier et al paper, one pairing has exceptional energy: gu/ug *) begin energy := energy - 0.4 end; procedure continuebulge; (* the interior loop bulge continues *) begin bulgebases := bulgebases + 2 end; procedure beginbulge; (* begin a bulge, of the interior loop kind *) begin if not bulging then begin bulgebases := 2; bulging := true end else continuebulge (* only a single base matches, so it is to be ignored *) end; procedure endbulge; (* close the bulge by recording its energy *) begin (* perhaps it is only a single match *) if getbase(pred(x5),xpiece) = complement(getbase(succ(y3),ypiece)) then begin (* it is really the end of a bulge *) if (2 <= bulgebases) and (bulgebases <= 6) then energy := energy + 2.0 else if (7 <= bulgebases) and (bulgebases <= 20) then energy := energy + 3.0 else (* bulgebases > 20 *) energy := energy + 1.0 + 2.0*ln(bulgebases)/ln(10); bulging := false end else continuebulge (* the single match is to be ignored *) end; begin (* freeenergy *) (* test consistency of input *) if (x5start - x3start) <> (y5start - y3start) then begin writeln(output,' function freeenergy:'); writeln(output,' the supplied ends are not consistent with a helix.'); halt end; x5bound := x5start; x3bound := x3start; y5bound := y5start; y3bound := y3start; moveendsin(x5bound,x3bound,xpiece, y5bound,y3bound,ypiece); (* set external energy to zero *) energy := 0.0; (* no bulges yet *) bulging := false; if x5bound < x3bound then begin (* within this if, the range is bounded by base pairs *) x3 := x3bound; y5 := y5bound; x5 := pred(x3); y3 := succ(y5); (* these points (x,y/5,3) now define a dinucleotide pair between x and y. now we must scan the the dinucleotide across to the other bound. *) repeat (* convert positons to bases *) x5b := getbase(x5,xpiece); x3b := getbase(x3,xpiece); y5b := getbase(y5,ypiece); y3b := getbase(y3,ypiece); if (x3b = complement(y5b)) and (x5b = complement(y3b)) then readofftable (* everybody pairs *) else if (y5b = g) and (y3b = t) and (x3b = t) and (x5b = g) then gupair (* a special case *) else if (x5b <> complement(y3b)) and (x3b <> complement(y5b)) then continuebulge (* nobody pairs *) else if x3b = complement(y5b) then beginbulge (* the first two pair *) else if x5b = complement(y3b) then endbulge (* the last two pair *) else halt; x3 := pred(x3); y5 := succ(y5); x5 := pred(x5); y3 := succ(y3); (*if debugging then writeln(output,energy:4:1); *) until x5 < x5bound end; freeenergy := energy end; (* freeenergy *) (* end module freeenergy version = 'delmod 6.51 85 apr 17 tds/gds' *) (* begin module unlimitln *) procedure unlimitln(var afile: text); (* this procedure removes a stupid system dependent limit on the number of lines that one can write to a file. you may remove it from the code if your system does not want or need this. suggested method: place comments around the contents of the procedure. *) begin (* linelimit(afile, maxint); (@ set 'infinite' lines allowed for afile *) end; (* end module unlimitln version = 'delmod 6.51 85 apr 17 tds/gds' *) procedure initialize; (* initialize the variables of helix *) begin writeln(output,' helix ',version:4:2); brinit(xbook); brinit(ybook); unlimitln(hlist); rewrite(hlist); new(xseq); new(yseq); (* read parameters *) reset(helixp); if eof(helixp) then begin (* set defaults *) minlenmaxene:=defminlenmaxene; guallowed:=defguallowed; givefreeenergy := defgivefreeenergy end else begin readln(helixp,minlenmaxene); if eof(helixp) then begin guallowed:=defguallowed; givefreeenergy := defgivefreeenergy end else begin if helixp^ = 'g' then guallowed:=true else guallowed:=false; readln(helixp); if eof(helixp) then givefreeenergy := defgivefreeenergy else if helixp^ = 'e' then givefreeenergy := true else givefreeenergy := false end end; if (minlenmaxene < absminlength) and (minlenmaxene > 0) then begin minlenmaxene:=absminlength; (* force a reasonable limit *) writeln(output,' warning: the minimum length helix printed is ', trunc(minlenmaxene):1) end; (* put header in hlist *) writeln(hlist,' helix ',version:4:2); write(hlist,' x: '); copyaline(xbook,hlist); write(hlist,' y: '); copyaline(ybook,hlist); if minlenmaxene > 0 then writeln(hlist,' ',trunc(minlenmaxene):1,' base pairs', ' is the minimum length helix recorded') else writeln(hlist,' ',minlenmaxene:4:2, ' kcal is the maximum energy of a helix recorded'); write(hlist,' '); if not guallowed then write(hlist,'no '); writeln(hlist,'gu pairs allowed'); write(hlist,' '); if not givefreeenergy then write(hlist,'no '); writeln(hlist,'energies calculated'); writeln(hlist) end; (* initialize *) (* note: procedure gethelix in program matrix depends on the forms that nextsequences and hrecord produce. *) procedure nextsequences(x,y: pieceptr; xnumber, ynumber: integer); (* write the new sequence names to hlist *) begin writeln(hlist,' x piece: ',xnumber:1,' ',x^.key.hea.keynam.letters, ' y piece: ',ynumber:1,' ',y^.key.hea.keynam.letters); end; procedure hrecord(x5, y3: integer; xsequence, ysequence: pieceptr; length: integer); (* record the helix whose 5" end on the x sequence is x5 and whose 3" end on the y sequence is y3 that extends some length. note that it is easier to find the helix on a listing if only the 5" ends are listed on sequences. (it is painful to find if one has to remember whether it is 5" or 3" depending on whether it is x or y.) *) var x3, y5: integer; (* the other end of the helix *) deltag: real; (* the energy of the helix *) begin (* hrecord *) (*if debugging then writeln(hlist,'hrecord: ',length:1);*) y5 := y3 - length + 1; if givefreeenergy then begin x3 := x5 + length - 1; deltag := freeenergy(x5,x3,xsequence, y5,y3,ysequence) end else deltag := 0.0; if ((minlenmaxene > 0.0) and (length >= minlenmaxene)) or ((minlenmaxene <= 0.0) and (deltag <= minlenmaxene)) then begin write(hlist,' x5',prime,': ',inttopie(x5,xsequence):6, ' y5',prime,': ',inttopie(y5,ysequence):6, ' length: ',length:3); if givefreeenergy then write(hlist,' ', deltag:7:2, ' kcal'); writeln(hlist) end end; (* hrecord *) procedure discani(xmin,ymin: integer; (* minimum values *) var x,y: integer; (* current values *) xmax,ymax: integer; (* maximum values *) var x0: integer; (* the starting x for a diagonal *) var endofdiagonal, endofscan: boolean); (* this procedure initializes the variables for discans, which then steps (x,y) along a diagonal of a matrix. *) var xdone: boolean; (* true when x reaches the side *) ydone: boolean; (* true when y reaches the top *) begin x0:=xmin; x:=x0; y:=ymin; xdone := (x=xmin); ydone := (y=ymax); endofdiagonal := xdone or ydone; endofscan := (x = xmax) and ydone end; procedure discans(xmin,ymin: integer; (* minimum values *) var x,y: integer; (* current values *) xmax,ymax: integer; (* maximum values *) var x0: integer; (* the starting x for a diagonal *) var endofdiagonal, endofscan: boolean); (* this procedure makes it easy to scan a matrix on the diagonal. in the cartesian rectangle (xmin,ymin) to (xmax,ymax) diagonals of slope -1 are scanned, by decreasing x and increasing y. diagonals closest to the origin are scanned first. to use: call discani to initialize the variables. call discans to step (x,y) along the diagonal. flags: endofdiagonal is true when the top or side of the rectangle has just been reached. the next values of x and y will jump. endofscan indicates that the last position has been reached. *) var xdone: boolean; (* true when x reaches the side *) ydone: boolean; (* true when y reaches the top *) begin if endofdiagonal then begin if endofscan then halt; (* protection *) x0:=succ(x0); (* move to the next diagonal *) if x0 <= xmax then begin x:=x0; y:=ymin end else begin x:=xmax; y:=ymin + (x0-xmax) end end else begin x:=x-1; y:=y+1 end; xdone := (x=xmin); ydone := (y=ymax); endofdiagonal := xdone or ydone; endofscan := (x = xmax) and ydone (* ;if xdone then writeln(output,'xdone'); if ydone then writeln(output,'ydone'); if endofdiagonal then writeln(output,'endofdiagonal'); if endofscan then writeln(output,'endofscan'); writeln(output,'x,y',x:1,y:1); *) end; procedure scan(xseq,yseq: pieceptr; minlenmaxene: real; guallowed: boolean); (* scan two sequences xseq and yseq for complementarity. their lengths are xlength and ylength. the minimum length exact helix to record is minlenmaxene. gu base pairs are allowed if guallowed is true. the helix is recorded by a procedure external to this one: hrecord(x,y,xseq,yseq,length), where (x,y) is the 5" end of the helix on xseq and length is the number of bases matched. the sequences scanned are recorded by procedure newsequences, also, external. the last external procedure determines whether two bases can base pair. *) var xlength,ylength: integer; (* lengths of the sequences *) x,y: integer; (* a matching of xseq to yseq *) length: integer; (* current length of a helix *) x0: integer; (* used by discan *) eod,eos: boolean; (* used by discan *) procedure terminate(x,y: integer); (* stop the helix extension, possibly recording its end at (x,y) *) begin (* terminate *) (*if debugging then writeln(hlist,'terminate: length= ',length:1, *) (* 'absminlength = ',absminlength:1);*) if length >= absminlength then hrecord(x,y,xseq,yseq,length); length:=0; end; (* terminate *) procedure look; (* look at the pairing at (x,y) *) begin if basepair(xseq,yseq,x,y,guallowed) then begin (* extend the helix *) length:=succ(length); if eod then terminate(x,y) end else terminate(x+1,y-1) end; begin nextsequences(xseq,yseq,xnumber,ynumber); xlength:=piecelength(xseq); ylength:=piecelength(yseq); length:=0; discani(1,1,x,y,xlength,ylength,x0,eod,eos); look; while not eos do begin discans(1,1,x,y,xlength,ylength,x0,eod,eos); look end end; begin initialize; while not eof(xbook) do begin getpiece(xbook,xseq); xnumber:=number; if not eof(xbook) then begin reset(ybook); while not eof(ybook) do begin getpiece(ybook,yseq); ynumber:=number; if not eof(ybook) then begin scan(xseq,yseq,minlenmaxene,guallowed); clearpiece(yseq) end end; clearpiece(xseq) end end; 1: end. (* of helix *)