
Commit 29dffdc

nora: fix format
1 parent bee4cf9 commit 29dffdc

1 file changed: +4, -7 lines changed


nora.html

Lines changed: 4 additions & 7 deletions
@@ -52,10 +52,7 @@ <h1 itemprop="headline" align="center">
 </h1>
 <br>
 <p style="line-height:1" align="center"><b>
-<font color="061E61">Chia-Yu Hung<sup>1</sup>, Qi Sun<sup>1</sup>, Pengfei Hong<sup>1</sup></font>
-</b></p>
-<p style="line-height:1" align="center"><b>
-<font color="061E61">Amir Zadeh<sup>2</sup>, Chuan Li<sup>2</sup></font>
+<font color="061E61">Chia-Yu Hung<sup>1</sup>, Qi Sun<sup>1</sup>, Pengfei Hong<sup>1</sup>, Amir Zadeh<sup>2</sup>, Chuan Li<sup>2</sup></font>
 </b></p>
 <p style="line-height:1" align="center"><b>
 <font color="061E61">U-Xuan Tan<sup>1</sup>, Navonil Majumder<sup>1</sup>, Soujanya Poria<sup>1</sup></font>
@@ -79,6 +76,8 @@ <h1 itemprop="headline" align="center">
 <p><a href="https://arxiv.org/abs/2504.19854">[Paper on ArXiv]</a>&nbsp;&nbsp;&nbsp;&nbsp;<a href="https://github.com/declare-lab/nora">[Code on GitHub]</a>&nbsp;&nbsp;&nbsp;&nbsp;<a href="https://huggingface.co/collections/declare-lab/nora-6811ba3e820ef362d9eca281">[Hugging Face]</a></p>
 </center>
 </div>
+<p align="center"><iframe width="560" height="315" src="https://www.youtube.com/embed/_6AsL7AAPzk?si=di4MXco-w73zlj1y" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe></p>
+<br>
 <h2 id="abstract">
 <font color="000093">Abstract</font>
 </h2>
@@ -100,11 +99,9 @@ <h2 id="abstract">
 <!-- <font color="061E61"> Despite training TANGO's LDM with 63x less data, it manages to produce superior sound quality to the baselines</font> -->
 <!-- </li> -->
 
-<br>
-<p align="center"><iframe width="560" height="315" src="https://www.youtube.com/embed/_6AsL7AAPzk?si=di4MXco-w73zlj1y" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe></p>
 <br>
 <figure>
-<p align="center"><img src="../NORA.png" width="100%" class="center" /></p>
+<p align="center"><img src="../NORA.png" width="80%" class="center" /></p>
 <figcaption>
 <p style="text-align: justify">
 <font color="061E61"><b>Figure 1:</b> NORA, as depicted in this figure, has three major components: (i) image encoder, (ii) vision language model, and (iii) FAST+ action tokenizer. The image encoder encodes the current state of the environment. Subsequently, the VLM predicts the next action in order to accomplish the input goal, given the current state. Thereafter, FAST+ decodes the VLM output tokens into actionable robot tokens.</font>
