PlantRegMap/PlantTFDB v5.0
Plant Transcription Factor Database
Transcription Factor Information
Basic Information
TF ID GBG69643.1
Organism
Taxonomic ID
Taxonomic Lineage
cellular organisms; Eukaryota; Viridiplantae; Streptophyta; Streptophytina; Charophyceae; Charales; Characeae; Chara
Family Trihelix
Protein Properties Length: 2237 aa    MW: 250014 Da    pI: 7.0488
Description Trihelix family protein
Gene Model
Gene Model ID    Type      Source    Coding Sequence
GBG69643.1       genome    NCBI      View CDS
Signature Domain
No.  Domain    Score  E-value  Start  End   HMM Start  HMM End
1    trihelix  39.1   1.9e-12  1682   1757  2          70
    trihelix    2 WtkqevlaLiearremeerlr.......rgklkkplWeevskkmrergferspkqCkekwenlnkrykkikegekk 70  
                  W+ + + aLi+a+r+ + +l+       r k ++ +W +v+ ++++ g+ r++ +C +kw+nl +++kk+ + ++ 
  GBG69643.1 1682 WSVDDIIALIRAKRDQDAHLQgmghayaRMKPREWKWLDVATRLKKVGVDREADRCGKKWDNLMQQFKKVHHFQGL 1757
                  99************777766643333336789999**********************************9887665 PP
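The Start and End values in the domain table (1682 and 1757) behave as 1-based, inclusive residue positions: slicing the protein sequence at those positions gives back the aligned GBG69643.1 row. The following is a minimal Python sketch of that check, assuming only the residues 1681 to 1800 copied from the Protein Sequence block of this record. The names `WINDOW`, `WINDOW_FIRST_RESIDUE`, and `slice_1based` are illustrative and not part of PlantTFDB.

```python
# Residues 1681-1800, copied from the Protein Sequence block of this record
# (grouping spaces kept as on the page, position counters left out).
WINDOW_FIRST_RESIDUE = 1681
WINDOW = "".join(
    (
        "FWSVDDIIAL IRAKRDQDAH LQGMGHAYAR MKPREWKWLD VATRLKKVGV DREADRCGKK "
        "WDNLMQQFKK VHHFQGLSGK QDFFQLSGKD RMSKGFNFNM DRAVYEEILG STAKNHTINL"
    ).split()
)


def slice_1based(seq, start, end, first_residue=1):
    """Return residues start..end (1-based, inclusive).

    first_residue is the protein position of seq[0], so a partial
    window of the protein can be sliced with full-length coordinates.
    """
    lo = start - first_residue
    hi = end - first_residue + 1
    if lo < 0 or hi > len(seq) or lo >= hi:
        raise ValueError("range falls outside the supplied sequence")
    return seq[lo:hi]


domain = slice_1based(WINDOW, 1682, 1757, first_residue=WINDOW_FIRST_RESIDUE)
print(len(domain))  # 76
print(domain)
```

The lowercase residues in the alignment row mark insertions relative to the HMM, so the comparison with the sliced sequence should be made case-insensitively.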

Sequence
Protein Sequence    Length: 2237 aa
MEEDHQRKSR GATEQITGAD GLAGTRSADV WRKHEDEKMV VVHLSGIDPC ITDQAPAAAV  60
KKTSTSITSM FFEIASVYSD RTAVIHPPVL KSRGVHLLNT GDEHLLKNGH IHLPRSSFRG  120
FVDSLESSDA KEKRTRAAAT TKNLPSSQTK IDNEQGQSSS RQLSEKCEKF EKSEKCEKSE  180
YCEDSGESEK CEEFEKCEET DQQMNEKETA YQFQPCETEG KVEYRERKAE CTEPKKVEYR  240
ERKVECIEPK KTEYTKRTVE IEYTKRKVEY TYKQLRRGVV NLAREIQRQV LKGREGESNE  300
GSGGGGGRGG GRGGMGERQA RGESDRGRGG GGGRRGGGRR GGGGGGGMGE KQASGCAAKN  360
GRCSMDEDSP LLRIGIFVDR SVEYIVGVLA SLYVGAAFVP FDTSWEDGGE RKQDCSPPIP  420
SHRPLGHVLA QQEDIHSLDA TAQNHKRALD DLNSRFQQLQ QTVAAPAANT SNMSDRLNAL  480
EIDMGTVKDG MQHQHTATQP LEQWICTAAA HPSSSEREST PKWDSQTIIL DSSKTDPVQW  540
FRKFELTLQL HYVSEHKNHA YLYSRLGGAC QAWLDNLLSK YGVIAADLHT KISWEDLKAA  600
WHKRFQVEPP EIKAMDKLLT FEQGTLPSVD WIAEYQRLTS VPDLQMGFKA IRHYFISRSC  660
PTLSNALTHV EDTLMTTAEL FDKATQIIIT NKEAKNLRSS AAGPSRDQHR PRVAVVAAAT  720
PFDQTSEARY DDNNAPLMYV RIQVGQASYN ALLDSGASRN FMSQSFMQRA GLGAQVRRKA  780
KPTAIKLADG KTQQLLNRYI EAVPVYFAPH ACEPVTFDIL DTDFDIILGM PWLASADHTV  840
NFHRRTLSVR NAFGAEVACT IPLPHRSIRC QVVTAKSFRA TCAYEQSEEI GLCFLRTVAV  900
ADASPTDLSL DPWVVRLLDE FADIFESPTG VVSDRPISHE IILEAGAVPP KGCIYRMSEE  960
ELEVLRAQLD DLLAKGWIRP SSSPYGAPVL FVWKKNKDLR LCIDYRKLNV QIVKNAGPLP  1020
RIDELLERLG GAKYFSKLDL KLGYHQISIR PNDRYKSAFK TWYGHFEWVV MPFGLTNAPT  1080
TFQAVMTNEF RAMLDRFVLV YLDDILIYSR SLEDHLGHLR RVLETLRRAK YKANRDKCEF  1140
VRQELEYLGH FVTPEGISPL SDKIQAIQEW PEPRNVTDVR SFLGLAGYYQ RFIKGYSKIV  1200
AHLNKLQCED RPFDFGEEAR ESFFALKAAL LFAEVLRIYD PLLPTRVTTD ASGYGIGAGL  1260
EQQDAVDWHP VEYFSKKVPI VHSIDDARKK ELLAFVHALK RWRHFLLGRS QFRWVTDNNP  1320
LVFYKTQDTV NSTIARWMAF IDQLDFFPDH IPGKSNRFAD AFSRRPDHCT AVYSTFEIDD  1380
DLRNSFIRGY QADPEFRDKY SNCSSPNPAP SHYRIQEGYL LVHTRGKDLL CVPSDSHLRT  1440
RLLGEFHDAP ATEHFGVNRT IGRLRERFCG RSAAVSRTVV VNPHPDDDGR EVTAVQRSPT  1500
SPAPLREASG NNKDPPRQQF RSPSVCRGAS ARPQWMQSPS PLSAGSSAGR RVGECRETAP  1560
AVADVGDAHD GREVWAEQRR LMRSVREESI TRGVQRLRVG EDGHDGEEAV GDAHDPDWND  1620
NGAEGGEDDA GYISRSKQAA AMGGRGGKTK SCGGNGRWGK RTAGKGSDAE GDVDGEGGRH  1680
FWSVDDIIAL IRAKRDQDAH LQGMGHAYAR MKPREWKWLD VATRLKKVGV DREADRCGKK  1740
WDNLMQQFKK VHHFQGLSGK QDFFQLSGKD RMSKGFNFNM DRAVYEEILG STAKNHTINL  1800
KNVADTGAQG GVRLPSASSA DPDGDGGAEH DDDDDDDGST KGSSQTTGGA DGFGKRKSTR  1860
QQTFEAMTEC MEKHGALMAS TVESNSKRHC SIAIRQCEAL EAEIEVQKKH YAALDEVDDL  1920
LFWNEREGFA IVKLIAEARG YLVAVARGEQ PPPIRRSIVL PHNSIPQHKI ADKSELNAAK  1980
ERALKVQGIA LRVIHGWVFK SQNRQRGYHA AYQYALNHAA TDIARAMWMG EDWRYCVSPM  2040
VVHHTLDMDM KLPLWFVGAD VEDMHEDDGL AAYQEASIQR LVGAFTSAVI IAEATDGGRV  2100
SHERLKTMVD AMRMMLAATV WLMRMAGDDH RAHYDAWVFV QLTAKPTLVA SMHRCFDARR  2160
HIMQAATVIT DKLASPPITL IDPPMYVPDW ASIGVKFSHD ATLSSPMEAK KVDWLGTGPS  2220
EDEDDGKGDE QGSGGGR
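The sequence is laid out in blocks of 10 residues, six blocks per line, with a running residue count at the end of each full line; the last line carries no count (2220 + 17 = 2237, matching the stated length). A small Python sketch for turning this layout back into a plain string while checking the counters is given below. The function name `parse_sequence_block` is hypothetical, and the example uses only the first two lines of the record.

```python
def parse_sequence_block(lines):
    """Join grouped residue lines and verify each trailing position counter."""
    chunks = []
    total = 0
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        # A trailing all-digit field is the running residue count for the line.
        counter = int(fields.pop()) if fields[-1].isdigit() else None
        chunk = "".join(fields)
        if not chunk.isalpha():
            raise ValueError(f"unexpected characters in line: {line!r}")
        total += len(chunk)
        if counter is not None and counter != total:
            raise ValueError(f"counter {counter} does not match {total} residues")
        chunks.append(chunk)
    return "".join(chunks)


# First two lines of the Protein Sequence block, copied from this record.
first_lines = [
    "MEEDHQRKSR GATEQITGAD GLAGTRSADV WRKHEDEKMV VVHLSGIDPC ITDQAPAAAV  60",
    "KKTSTSITSM FFEIASVYSD RTAVIHPPVL KSRGVHLLNT GDEHLLKNGH IHLPRSSFRG  120",
]
sequence = parse_sequence_block(first_lines)
print(len(sequence))  # 120
```

Passing all lines of the block in order should return the full protein, and a mismatch between a counter and the number of residues read so far raises an error instead of silently producing a shifted sequence.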
Nuclear Localization Signal
NLS
No.  Start  End  Sequence
1    328    339  RGGGGGRRGGGR
2    330    338  GGGGRRGGG
3    331    339  GGGGRRGGG
4    332    340  GGGGRRGGG
5    336    344  GGGGRRGGG
6    337    345  GGGGRRGGG
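The six predicted NLS hits overlap heavily within the glycine/arginine-rich stretch. One way to summarise them is to merge overlapping ranges into a single region, sketched below in Python. The Start and End values are taken from the NLS table of this record and treated as 1-based, inclusive positions; `merge_intervals` is an illustrative helper, not something PlantTFDB provides.

```python
def merge_intervals(intervals):
    """Merge overlapping or directly adjacent (start, end) ranges (inclusive)."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            # Overlaps or touches the previous region: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(region) for region in merged]


# Start/End columns of the NLS table for GBG69643.1.
NLS_HITS = [(328, 339), (330, 338), (331, 339), (332, 340), (336, 344), (337, 345)]
print(merge_intervals(NLS_HITS))  # [(328, 345)]
```

Under this reading, all six hits collapse into one candidate region spanning residues 328 to 345.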
Best hit in Arabidopsis thaliana
Hit ID       E-value  Description
AT2G35640.1  4e-07    Trihelix family protein