SGID Silkworm Genome Informatics Database
Gene
KWMTBOMO01975  Validated by peptides from experiments
Pre Gene Model
BGIBMGA014039
Annotation
PREDICTED: collagen alpha-1(IV) chain [Papilio xuthus]
Full name
Collagen alpha-1(IV) chain      
Alternative Name
Collagen type IV alpha 1
Location in the cell
Nuclear   Reliability : 4.45
 

Sequence

CDS
ATGGTGGTCTACGTACTGTGGTTAGTAGCTGCTCTGGCACCTAAAATAATCGGAGTAAGCTCGCAAGATGATAGATATGCGGATTGGAATAACATATATTCCAGGGATTTGCCGCCGACTAACTGGCCTCGTGAAGTGGCGCCAGATACGGAGCGACCATACGCTGTAAATCAATATGGGTTGTATAATAGAAACGATATACCACCCCAATCAAGACCGGAACCAGAATTAGGGCAAAACTTCGCTGTGTATGATCCGGACACACGTCAAAGGACATCCACAGCTATTAATCGTAATTGTACAGCTCCAGGATGCTGTGTTCCTAAATGTTTCGCTGAGAAGGGTAACAGAGGATTTCCCGGCTCACCTGGACCACAAGGCCCACGTGGTCTTCCTGGCCATGAAGGTGCTGAAGGTGCGCAAGGACCTAAGGGGCAAAAAGGTCAAATAGGTCCACAGGGTCCACGTGGTCCTAAAGGTGACAAAGGCAAAACTGGAGCAAAAGGATTTGCTGGTAGAATGGGTATTCAAGGATCCCCTGGTGACCAAGGTAGACCGGGTATACCAGGAAGGGATGGTTGTAATGGAACTGATGGTGAACCTGGAATAGCTGGACCAAAAGGTTCTCAAGGACCGCGTGGTTACGCTGGATCCAAAGGAGATAAGGGAGATAAAGGTGAACCAGCTCACATTGGACGATATCCGAAAGGACAAAAAGGAGAACCTGGTGCAGATGGCATGCAAGGACAACCAGGTCTGGTTGGACCTCAAGGACCAGTTGGTTTGAAAGGACCAAAAGGCAGAGTTGGTCCTGCTGGCTTAACAGGACCTAAAGGAGATAAAGGTGCTAGAGGTGCAAAGGGCCACTCGATTCAAGGTGACAAAGGAGACAGAGGCGACAAAGGTGACCGAGGTCAAAGTTGTTCGCCAACTTCTTCCGTTGATTTCAATAATAAGGGAATTCACAAAAATATACAAGGAGATATGGGTGAAAAGGGTGACAAAGGAGAACCAGGTCGTATGGGACAAAAGGGTGATATTGGACCAATGGGGGAGCCTGGACTGTCAGGCCAGATGGGGATCAAAGGAGAAAAAGGCTTACGAGGAAATCCGGGTGAAAGGGGACGTGAAGGAATGTATGGCGCACCGGGGCCAATGGGTAATAAAGGAGAAAGAGGAAACGATGGATTATCTGGTTTAGCAGGAATACCTGGCCGAAAGGGTGAACCAGGTAGAGATGGAATTCCTGGACCTAGAGGCTTAAAAGGTGTTCCTGGTCCACCAGGGGGTCTTTCAGGTTCTCGGGGACCTCCTGGTCCCCCTGGTCCCAGAGGTTATGAGGGCCCACAAGGACCTAAAGGCACTGATGGAAGGCCGGGAGACAGAGGTCAAACAGGACCAATGGGCTCCCCAGGAAGTCAAGGTGAACCCGGAACACCAGGAATTGAGGGTCCTGCGGGACATAAAGGTGAAAAGGGAGAAGCTGGTTTTGATGGTCAAAAAGGAGAGTCTGGGCCTCGTGGATACGATGGAAATGTCGGACCCGTAGGACCTAGAGGAGAAAAAGGAGAAGATGGTCTATCCATAATTGGTCCAAAAGGAAATAGTGGCTTACCTGGTTTTTCCGGAGATAAGGGTCAAAAAGGAGAAAGAGGTTATACCGGTCTTAGAGGCGTACCAGGAAATTCCACCTTTGGTACACCTGGTATTCCAGGAGAAATAGGTCTTCCCGGAGAAAAGGGAGATAAAGGAATGCCTGGTTACGACGGTTTGCCTGGAAATACTGGACCCAAGGGTAACATAGGAGGTCGCTGCAATGAGTGCTCCCCCGGATCTTCCGGTCCAAAGGGAGACCGTGGCAACGATGGAGCAGTTGGTCCACGAGGCGAGAGGGGGCCAATAGGTCCAATTGGTTTTACTGGCGAACGAGGTGCTGATGGCATGCATGGCTTGCCTGGTGCACCTGGAGCACCTGGCGAACGGGGAGAAGATGGACCAAC
TGGTCCACCTGGAGATAGAGGAGCTGATGCCAACGTTCCGGTAAATTTAATTAAAGGTCCGAAGGGTGATAAAGGAAGTACAGGGCCCCGAGGTCCTCAGGGTCCTAAGGGTGAAATTGGTAGTGATGGTCCTAAAGGAGACCGAGGACAAGTTGGTATGCCAGGCCCAAAAGGAGACCGAGGCTATGAAGGACAGCCTGGTGTTGATGGTATACCTGGTGCAGATGGAATTCCTGGTATACCTGGAATAAAAGGAATATCTATAAAAGGAGACAAGGGCTTACCAGGTGATAGAGGGTTCAAAGGTGACAAAGGTAGTCCGGGTGATAGAGGTATAAAAGGCCAAGCAGGCCAATGTCCTGCTGATGTAAAAGAACTTACAAGAGGCGACAGAGGCGACAGAGGTGATACTGGCCCTCCAGGGCCTGCAGGTGAACCAGGAGAAAAGGGAGATAAAGGTTACATTGGATTGCAAGGTCAAAAAGGGGATATTGGTCCACCTGGTAGGCAGGGTCCTGTGGGAGCACGTGGGTTCCCAGGAATTCGAGGTGAAAAAGGAGAACTCGGTTCAAAAGGTTTTCCAGGAACTCCTGGTGAGAATGGGCCGCGAGGCTACCCCGGAAAACCAGGTTTTAAAGGAGATAAAGGAGAAGTCGGACCTTCCATCGTTGGTCCACCAGGCTTACCTGGTATACCAGGTCAAAAAGGTGACTCTGGTCTTCGAGGATTGCCAGGTATACCCGGTGATGATGGTCCGCCAGGTCCTCAAGGCTTACATGGAGAAAAGGGAGATCAAGGATTGGTGGGTAGACCTGGATTACCTGGTCAACCAGGTCAAAAAGGTGATGCAGGACCCGTGGGGCCCGCTGGGGTCCCAGGTATTCCTGGATTACCCGGTAAGGTTGGAGCAAAAGGACAACAAGGTTTCCCTGGAGAACCTGGTAGGCCAGGTGTTATTGGTTTGCCTGGTCAGAAAGGTGATATGGGAATCCAAGGGCCAGATGGCCAAAAAGGTTTTCCGGGTTCTCGAGGACGTCCTGGACCTCCAGGTCATCCGGGCGCACTAGGCTCACAGGGAGAAAAGGGTGATAAAGGAGAATTAGGGTATCCAGGTTCACCCGGTTCTCCAGGCCAGTCTGGACGTCCAGGTCCTGTCGGTCCTCAAGGACCTAAAGGAGATCAAGGATTTGAAGGCCCTCCAGGACTACCGGGATTACCTGGTCTATTGGGTCAAACTGGTGACAGAGGTTATACGGGCCCTAAAGGTGACAAAGGGGATGCTGGCTTAGCAGCTGAAAAGGGATCAAAAGGAGAACCTGGTCCACCTGGTCTACTTGGTATAGATGGTCTACCTGGTAGAGATGGAGAGAAAGGCGACACTGGTGAACCAGGAATACCTGGTCAAGGAATACCTGGGTATCCTGGCCAAAAAGGAGAAATGGGGATGCGTGGGTTTGATGGACTCTCGGGACCCATTGGAGAAAAAGGAAATCGTGGTCCTCAAGGAGTACCTGGTCTAAAGGGTAACTTGGGCATCTCAGGCGAGCCAGGTCGACCTGGAGTACCTGGTATCGATGGAGCACCTGGCCAGCCTGGTGATGTAGGCTTACCTGGTATTACTGGAGAAAAAGGAGATAAAGGTGAATTAGGTTTCCCCGGACGAGATGGATTAGGCGGGTTGAAAGGCGATCGTGGTCAACCTGGGGCAGAAGGTCCAATAGGACCAATAGGGTATCGAGGTCCTAAAGGAGACACGGGACTGCCAGGCGTATCCATAGATATTAAAGGTGACAAAGGAGAAGTGGGTCCAAGCGGTATTCCTGGTGAACCAGGTCAAAAGGGAGATCGAGGTCTCCCAGGATTACAAGGATTACAAGGTGAAAAAGGTGATCGTGGCTCACTAGGAGAAAAAGGAGACCAAGGTTTTACTGGGCGAATGGGAGAAAAAGGTGACACTGGTCCTATCGGTCCTACCGGCCTACCAGGCTTAACTATAAAAG
GAGAAAAAGGCTTACCAGGAACTCATGGAAAACATGGGAGGCCAGGATTACCTGGTGCTCCAGGACAAAAAGGCGATCAAGGTCTACCAGGACTTCCGGGACAATTGGGTCGGCCTGGTATACCTGGGCTCCCAGGTGAAAAAGGACAAAAGGGTGATCAGGGTAATGAAGGTTTGGCAGGGCCACCAGGTCTTGTAGGTCCAACTGGACTACCTGGAACTCCTGGTATTACTGGGGAAAAAGGAGATCGAGGAGAGAAGGGGGCCACTGGGTTTGGTGCACCAGGGCAAAAAGGTGATCAAGGTCCCTCAGGTATACCAGGATTACCTGGTGAGAAAGGTGATAAAGGAGATCGTGGATTAGATGGATTACCTGGTCGAACCGGACCGATTGGTCCGCCAGGTCAGAAAGGTGATCGTGGCTATCCAGGAAGACCAGGCTTACAAGGTGAACAAGGGATGAAAGGTAATAAAGGCCAAGCGGCAGAACTCGTGTACGGCGCAAAAGGAGAACCAGGACCGCGTGGTTTGCCAGGAAATGATGGTCTACCAGGCGTAAACGGCGTTCCTGGTCGTCCAGGTAATAATGGACCCCCTGGTGAAAAGGGGGATAGAGGTTTTACTGGTGCTAGAGGTTTCCCGGGACCAAGAGGTCTCCCAGGTATACAGGGAATGGAAGGTGAAAGAGGAGAAATTGGTATGACAGGTCAAAGTGGACTTCCAGGAGCGCCTGGAGCACCATGCGTTAGTCAAGACTTCTTAACCGGAATTTTATTGGTACGACACAGTCAAAGAGAAGTCGTCCCACAATGTGAACCCAGTCACGTCAAATTATGGGATGGATATTCTTTATTGTACATAGATGGCAACGAAAAAGCCCATAATCAAGATTTAGGATATGCTGGTTCTTGTGTGAGAAAGTTCAGTACAATGCCATTCCTATTTTGTGACTTAAATGATGTCTGTAATTACGCAAGTCGTAACGACCGTAGTTATTGGTTATCCACTGGTCAACCAATACCAATGATGCCTGTTGAGGGACAAAATATTGCCAAATATATTTCAAGATGTGTGGTCTGTGAAGTACCGGCTAATGTGATTGCAGTACACAGTCAAACATTAGATATTCCGAGTTGCCCAGTTGGTTGGAATGAATTATGGATTGGTTACAGTTTTGTTATGCACACTGGTGCAGGAGGTCAAGGTGGAGGACAGGCTTTAGCAAGTCCTGGATCGTGCTTAGAAGATTTCCGTTCGATTCCATTTATAGAGTGTAATGGAGAAGGTGGAACTTGTCATCATTTTGCAAATAAACTTAGTTTCTGGCTAACTACTGTCGAAGATAGTCAACAGTTTAATGTACCGGAGCGCCAAACTCTAAAATCTGGTCAACTTCTTGAACGAGTCTCTCGATGTGCGGTTTGCATTAAGAATACGACATAG
Protein
MVVYVLWLVAALAPKIIGVSSQDDRYADWNNIYSRDLPPTNWPREVAPDTERPYAVNQYGLYNRNDIPPQSRPEPELGQNFAVYDPDTRQRTSTAINRNCTAPGCCVPKCFAEKGNRGFPGSPGPQGPRGLPGHEGAEGAQGPKGQKGQIGPQGPRGPKGDKGKTGAKGFAGRMGIQGSPGDQGRPGIPGRDGCNGTDGEPGIAGPKGSQGPRGYAGSKGDKGDKGEPAHIGRYPKGQKGEPGADGMQGQPGLVGPQGPVGLKGPKGRVGPAGLTGPKGDKGARGAKGHSIQGDKGDRGDKGDRGQSCSPTSSVDFNNKGIHKNIQGDMGEKGDKGEPGRMGQKGDIGPMGEPGLSGQMGIKGEKGLRGNPGERGREGMYGAPGPMGNKGERGNDGLSGLAGIPGRKGEPGRDGIPGPRGLKGVPGPPGGLSGSRGPPGPPGPRGYEGPQGPKGTDGRPGDRGQTGPMGSPGSQGEPGTPGIEGPAGHKGEKGEAGFDGQKGESGPRGYDGNVGPVGPRGEKGEDGLSIIGPKGNSGLPGFSGDKGQKGERGYTGLRGVPGNSTFGTPGIPGEIGLPGEKGDKGMPGYDGLPGNTGPKGNIGGRCNECSPGSSGPKGDRGNDGAVGPRGERGPIGPIGFTGERGADGMHGLPGAPGAPGERGEDGPTGPPGDRGADANVPVNLIKGPKGDKGSTGPRGPQGPKGEIGSDGPKGDRGQVGMPGPKGDRGYEGQPGVDGIPGADGIPGIPGIKGISIKGDKGLPGDRGFKGDKGSPGDRGIKGQAGQCPADVKELTRGDRGDRGDTGPPGPAGEPGEKGDKGYIGLQGQKGDIGPPGRQGPVGARGFPGIRGEKGELGSKGFPGTPGENGPRGYPGKPGFKGDKGEVGPSIVGPPGLPGIPGQKGDSGLRGLPGIPGDDGPPGPQGLHGEKGDQGLVGRPGLPGQPGQKGDAGPVGPAGVPGIPGLPGKVGAKGQQGFPGEPGRPGVIGLPGQKGDMGIQGPDGQKGFPGSRGRPGPPGHPGALGSQGEKGDKGELGYPGSPGSPGQSGRPGPVGPQGPKGDQGFEGPPGLPGLPGLLGQTGDRGYTGPKGDKGDAGLAAEKGSKGEPGPPGLLGIDGLPGRDGEKGDTGEPGIPGQGIPGYPGQKGEMGMRGFDGLSGPIGEKGNRGPQGVPGLKGNLGISGEPGRPGVPGIDGAPGQPGDVGLPGITGEKGDKGELGFPGRDGLGGLKGDRGQPGAEGPIGPIGYRGPKGDTGLPGVSIDIKGDKGEVGPSGIPGEPGQKGDRGLPGLQGLQGEKGDRGSLGEKGDQGFTGRMGEKGDTGPIGPTGLPGLTIKGEKGLPGTHGKHGRPGLPGAPGQKGDQGLPGLPGQLGRPGIPGLPGEKGQKGDQGNEGLAGPPGLVGPTGLPGTPGITGEKGDRGEKGATGFGAPGQKGDQGPSGIPGLPGEKGDKGDRGLDGLPGRTGPIGPPGQKGDRGYPGRPGLQGEQGMKGNKGQAAELVYGAKGEPGPRGLPGNDGLPGVNGVPGRPGNNGPPGEKGDRGFTGARGFPGPRGLPGIQGMEGERGEIGMTGQSGLPGAPGAPCVSQDFLTGILLVRHSQREVVPQCEPSHVKLWDGYSLLYIDGNEKAHNQDLGYAGSCVRKFSTMPFLFCDLNDVCNYASRNDRSYWLSTGQPIPMMPVEGQNIAKYISRCVVCEVPANVIAVHSQTLDIPSCPVGWNELWIGYSFVMHTGAGGQGGGQALASPGSCLEDFRSIPFIECNGEGGTCHHFANKLSFWLTTVEDSQQFNVPERQTLKSGQLLERVSRCAVCIKNTT
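The CDS and Protein entries above can be cross-checked by translating the nucleotide sequence with the standard genetic code. The following is a minimal sketch and not part of SGID: the function name `translate_cds` is hypothetical, and the standard code (NCBI translation table 1) is assumed to apply. The example uses only the first 21 and the last 18 nucleotides of the CDS shown above.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_cds(cds: str) -> str:
    """Translate an in-frame CDS, stopping at the first stop codon."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":  # stop codon ends translation
            break
        residues.append(aa)
    return "".join(residues)


# First seven codons of the CDS above.
print(translate_cds("ATGGTGGTCTACGTACTGTGG"))  # MVVYVLW
# Last six codons of the CDS above, including the TAG stop.
print(translate_cds("ATTAAGAATACGACATAG"))     # IKNTT
```

Both outputs agree with the start (MVVYVLW...) and end (...IKNTT) of the Protein entry.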

Summary

Subunit
Trimers of two alpha 1(IV) and one alpha 2(IV) chain. Type IV collagen forms a mesh-like network linked through intermolecular interactions between 7S domains and between NC1 domains.
Similarity
Belongs to the type IV collagen family.
Keywords
Basement membrane   Collagen   Complete proteome   Disulfide bond   Extracellular matrix   Glycoprotein   Hydroxylation   Reference proteome   Repeat   Secreted   Signal  
Feature
propeptide  N-terminal propeptide (7S domain)
chain  Collagen alpha-1(IV) chain
EMBL
BABH01041525    KQ459601    KPI93818.1    AGBW02013994    OWR42428.1    GAIX01005366
JAA87194.1    ODYU01010379    SOQ55454.1    NWSH01000276    PCG77833.1    KZ150090    PZC73686.1    JTDY01004730    KOB67822.1    KQ460398    KPJ15135.1    GBYB01001322    JAG71089.1    GFDL01002239    JAV32806.1    GFDL01002241    JAV32804.1    GFDF01007696    JAV06388.1    GFDF01007695    JAV06389.1    ABLF02028010    ABLF02028011    GFXV01004823    MBW16628.1    AJVK01003669    GAMC01019528    JAB87027.1    GANO01001294    JAB58577.1    GGFK01006350    MBW39671.1    KK852643    KDR19588.1    ADMH02001962    ETN60202.1    AXCM01007666    ATLV01025642    KE525396    KFB52541.1    NEVH01016289    PNF26332.1    UFQS01000618    UFQT01000618    SSX05466.1    SSX25825.1    CP012523    ALC38251.1    CH940649    EDW64003.2    CM000157    EDW88397.1    KRJ97759.1    J02727    M23704    M96575    AE014134    V00200    M28334    BT053743    ACK77661.1    CH916368    EDW03878.1    CH379061    EAL33016.2    CH933807    EDW11116.1    CH480820    EDW54322.1    CH954177    EDV57814.1    OUUW01000006    SPP81930.1    NNAY01001368    OXU24218.1    CM002910    KMY88215.1    KMY88216.1    KMY88217.1    CVRI01000038    CRK94057.1    SSX05469.1    SSX25828.1    JH431704    HACA01020289    CDW37650.1    GDHC01011943    GDHC01011608    JAQ06686.1    JAQ07021.1    GAMC01019529    JAB87026.1    KZ270010    OZC08330.1    U07224    AAC46611.1    KN726230    KIH69049.1   
Pfam
PF01413   C4
PF14808   TMEM164
PF01391   Collagen
PF01529   DHHC
PF00155   Aminotran_1_2
Interpro
IPR008160   Collagen
IPR016187   CTDL_fold       
IPR001442   Collagen_IV_NC       
IPR026508   TMEM164       
IPR036954   Collagen_IV_NC_sf       
IPR001594   Palmitoyltrfase_DHHC       
IPR003134   Hs1_Cortactin       
IPR004838   NHTrfase_class1_PyrdxlP-BS       
IPR015422   PyrdxlP-dep_Trfase_dom1       
IPR005958   TyrNic_aminoTrfase       
IPR015424   PyrdxlP-dep_Trfase       
IPR005957   Tyrosine_aminoTrfase       
IPR004839   Aminotransferase_I/II       
IPR015421   PyrdxlP-dep_Trfase_major       
SUPFAM
SSF56436   SSF56436
SSF53383   SSF53383       
PDB
1T61     E-value=1.6246e-81,     Score=777

Ontologies

Topology

Subcellular location
  
SignalP
Position:   1 - 21,         Likelihood:  0.966088
 
 
Length:
1813
Number of predicted TMHs:
0
Exp number of AAs in TMHs:
0.55219
Exp number, first 60 AAs:
0.55128
Total prob of N-in:
0.02914
outside
1  -  1813
 
 

Population Genetic Test Statistics

Pi
214.824443
Theta
179.234655
Tajima's D
0.837672
CLR
0.165168
CSRT
0.620768961551922
Interpretation
Uncertain
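For context on how Pi, Theta and Tajima's D relate: Tajima's D is the difference between the pairwise diversity (Pi) and Watterson's estimator (Theta), scaled by an estimate of its standard deviation, so Pi greater than Theta gives a positive D, as in the values above. The sketch below implements the textbook formula from Tajima (1989). It is an illustration only: the record does not state the sample size or the number of segregating sites, it is an assumption that the Theta reported here is Watterson's estimator, and the function name `tajimas_d` is hypothetical, so the reported value cannot be reproduced from this page alone.

```python
import math


def tajimas_d(n: int, seg_sites: int, pi: float) -> float:
    """Tajima's D from sample size n, segregating sites S and pairwise diversity pi.

    n and seg_sites are NOT given in the record above; they are inputs the
    caller must supply from the underlying population data.
    """
    if n < 4:
        # For n <= 3 the variance term is zero and D is undefined.
        raise ValueError("need at least 4 sequences")
    if seg_sites < 1:
        raise ValueError("need at least 1 segregating site")

    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    theta_w = seg_sites / a1  # Watterson's estimator
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - theta_w) / math.sqrt(variance)
```

With made-up inputs such as `tajimas_d(10, 16, 8.0)`, the result is positive because 8.0 exceeds Watterson's estimate for those inputs.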
Peptides
Source Sequence Identity Evalue
26822097 EVAPDTERPYAVNQYGIYNR 96.00 1e-10
26280517 NDIEMKENEVIK 96.00 1e-10
28467696 NDIIFCQIVDEEKCEYGCACK 96.00 1e-10
28556443 SYWLSTGQPIPMMPVEGQNIAK 100.00 6e-09
28556443 MGIQGSPGDQGR 100.00 6e-09
28556443 LSFWLTTVEDSQQFNVPER 100.00 6e-09
26822097 GYDGNVGPVGPR 95.45 1e-08
26280517 SWSTEVYCPVPGIVPTAPSSR 95.45 1e-08
24402669 SYVEVYPIDESKPVMR 95.45 1e-08
28467696 SYVSGYTPSQADVQVFEQVGK 95.45 1e-08
21761556 SIYKPILEYQLAEDGGK 100.00 2e-07
28556443 GQPGAEGPIGPIGYR 100.00 2e-07
28556443 GYDGNVGPVGPR 100.00 2e-07
26822097 AHNQDIGYAGSCVR 100.00 3e-07
26280517 SINDWVEENTNNR 100.00 3e-07
28467696 SIPEIQPIFESIIK 100.00 3e-07
28556443 SGQLLERVSR 100.00 3e-07
28556443 EVVPQCEPSHVK 100.00 0.001
28556443 YADWNNIYSR 100.00 0.001
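The peptides listed above can be mapped back onto the protein sequence with a simple substring search. The sketch below is an illustration and not SGID's own matching procedure; the function name `locate_peptide` is hypothetical. It treats isoleucine (I) and leucine (L) as equivalent, since the two residues have the same mass and are not distinguished by standard mass spectrometry. That is one way a peptide such as EVAPDTERPYAVNQYGIYNR can correspond to ...YGLYNR in the protein. The example uses only the first 70 residues of the Protein entry.

```python
def _normalize(seq: str) -> str:
    """Upper-case a sequence and collapse isoleucine into leucine."""
    return seq.upper().replace("I", "L")


def locate_peptide(protein: str, peptide: str) -> int:
    """Return the 1-based start of peptide in protein, or 0 if it is absent."""
    return _normalize(protein).find(_normalize(peptide)) + 1


# First 70 residues of the Protein entry in this record.
N_TERM = (
    "MVVYVLWLVAALAPKIIGVSSQDDRYADWNNIYSR"
    "DLPPTNWPREVAPDTERPYAVNQYGLYNRNDIPPQ"
)

print(locate_peptide(N_TERM, "YADWNNIYSR"))            # 26
print(locate_peptide(N_TERM, "EVAPDTERPYAVNQYGIYNR"))  # 45
print(locate_peptide(N_TERM, "NDIEMKENEVIK"))          # 0 (not in this fragment)
```

A return value of 0 only means that the peptide is not an exact I/L-tolerant substring of the sequence searched. It does not account for mismatches below 100 % identity.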
Copyright © 2018-2023. Send comments and suggestions to zhuzl@cqu.edu.cn.