bash - awk: count the number of matches greater than and less than in columns from two files


I want to match $1 of one file against $1 of another file, and then count the matches for which $2 (file1) < $2 (file2) < $3 (file1), doing this for each match.

file1 (segments)

  chromosome start end value
  chr1 0 121347754 -0.009727287106215954
  chr1 144009053 249250621 0.18180939555168152
  chr2 0 90278124 -0.0197499617934227
  chr2 95387134 243199373 -0.009399870410561562
  chr3 0 91000000 -0.015508042648434639
  chr3 93541117 198022430 0.011255052872002125
  chr4 0 49064792 -0.02086501568555832
  chr4 52700771 143350756 0.013872206211090088
  chr4 143350756 191154276 -0.004134085960686207

file2 (probes)

  chromosome start end value array
  chr1 798959 798959 1.0
  chr1 1048955 1048955 0.0 0
  chr1 1158277 1158277 0.0 0
  chr1 1314015 1314015 0.5307189226150513 0
  chr1 1489928 1489928 0.45127609372138977 0
  chr1 1499298 1499298 1.0
  chr1 1948400 1948400 0.0 0
  chr1 2021114 2021114 0.0 0
  chr1 2056735 2056735 0.0 0

Therefore the output will be:

  $1 (matching in both file1 and file2)  $2 (file1)  $3 (file1)  $4 (number of matches)

Output

  chromosome start end probes
  chr1 0 121347754 238
  chr1 144009053 249250621 590
  chr2 0 90278124 321

I have tried to do this, but it is not working!

This is as far as I have got:

  awk 'FNR==NR {a[$1]=$1 FS $2; next} {print $1[file1] "\t" $2[file1] "\t" $3[file1] "\t" $2[file1] < $2[file2] < $3[file1]}' file1 file2

Another way, using awk:

  awk 'BEGIN { print "chromosome start end probes" }
       # first file (file2): collect the $2 positions per chromosome
       NR == FNR { a[$1] = a[$1] == "" ? $2 : a[$1] FS $2; next }
       # second file (file1): count the positions inside each segment
       { delete c
         n = split(a[$1], b, FS)
         for (i = 1; i <= n; i++) if (b[i] > $2 && b[i] < $3) c[$1]++
         if (c[$1]) print $1, $2, $3, c[$1] }' file2 file1

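Explanation

  • BEGIN { print "chromosome start end probes" }: print the header line.
  • NR == FNR { a[$1] = a[$1] == "" ? $2 : a[$1] FS $2; next }: read file2 and build the array a, keyed by $1, where each element holds the space-separated list of $2 positions seen for that chromosome.
  • split(a[$1], b, FS): for each segment in file1, split the positions stored for its chromosome into the array b.
  • if (b[i] > $2 && b[i] < $3) c[$1]++: count the positions that lie strictly between the segment's start and end; the count is printed only when it is non-zero.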
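To make the steps above concrete, here is a minimal sketch that runs the same logic on tiny made-up inputs. The sample rows, the temporary directory and the function name count_probes are assumptions for illustration only, not the asker's data. It also assumes the coordinates are plain integers without thousands separators, and it uses a simple counter variable in place of the c array.

```shell
#!/bin/sh
# Hypothetical miniature inputs (not the real data from the question).
workdir=$(mktemp -d)

cat > "$workdir/file1" <<'EOF'
chromosome start end value
chr1 0 1000 -0.01
chr1 2000 3000 0.18
chr2 0 500 -0.02
EOF

cat > "$workdir/file2" <<'EOF'
chromosome start end value array
chr1 100 100 1.0 0
chr1 900 900 0.0 0
chr1 2500 2500 0.5 0
chr1 5000 5000 0.0 0
chr2 600 600 1.0 0
EOF

# count_probes PROBE_FILE SEGMENT_FILE (illustrative helper name)
count_probes() {
    awk '
        BEGIN { print "chromosome start end probes" }
        # First file: remember every probe position per chromosome.
        NR == FNR { a[$1] = (a[$1] == "" ? $2 : a[$1] FS $2); next }
        # Second file: count probes strictly inside each segment.
        {
            count = 0
            n = split(a[$1], b, FS)
            for (i = 1; i <= n; i++)
                if (b[i] > $2 && b[i] < $3) count++
            if (count) print $1, $2, $3, count
        }
    ' "$1" "$2"
}

result=$(count_probes "$workdir/file2" "$workdir/file1")
printf '%s\n' "$result"
rm -rf "$workdir"
```

With these sample files, the chr2 segment is not printed because its only probe (600) lies outside 0..500, which mirrors the `if (c[$1])` guard in the answer. The header rows produce no output line either, since the text "start" is not greater than itself.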
