Commit 113157f

Deployed bcfdeac with MkDocs version: 1.6.1
1 parent 84bdca9 commit 113157f

2 files changed

+34
-6
lines changed

api/core/problem.html

Lines changed: 33 additions & 5 deletions
@@ -2830,7 +2830,9 @@ <h3 id="astra_rl.core.problem.ValueFunctionProblem" class="doc doc-heading">
28302830
<span class="normal">289</span>
28312831
<span class="normal">290</span>
28322832
<span class="normal">291</span>
2833-
<span class="normal">292</span></pre></div></td><td class="code"><div><pre><span></span><code><span class="k">class</span><span class="w"> </span><span class="nc">ValueFunctionProblem</span><span class="p">(</span><span class="n">Problem</span><span class="p">[</span><span class="n">StateT</span><span class="p">,</span> <span class="n">ActionT</span><span class="p">],</span> <span class="n">ABC</span><span class="p">):</span>
2833+
<span class="normal">292</span>
2834+
<span class="normal">293</span>
2835+
<span class="normal">294</span></pre></div></td><td class="code"><div><pre><span></span><code><span class="k">class</span><span class="w"> </span><span class="nc">ValueFunctionProblem</span><span class="p">(</span><span class="n">Problem</span><span class="p">[</span><span class="n">StateT</span><span class="p">,</span> <span class="n">ActionT</span><span class="p">],</span> <span class="n">ABC</span><span class="p">):</span>
28342836
<span class="w"> </span><span class="sd">&quot;&quot;&quot;Extends `Problem` to be able to return sequence values with a value head.</span>
28352837

28362838
<span class="sd"> Note:</span>
@@ -2861,7 +2863,9 @@ <h3 id="astra_rl.core.problem.ValueFunctionProblem" class="doc doc-heading">
28612863
<span class="sd"> Returns:</span>
28622864
<span class="sd"> torch.Tensor[batch_size, max_continuation_length]: The per-token values of</span>
28632865
<span class="sd"> the given sequence by the sequence predictor. Do not include the value of the input</span>
2864-
<span class="sd"> prefixes.</span>
2866+
<span class="sd"> prefixes. If you are predicting on the whole input, you should be slicing on</span>
2867+
<span class="sd"> `[:, :-1]`, meaning you should *not* return the value of the last token, whose</span>
2868+
<span class="sd"> input is eos/context length limit.</span>
28652869
<span class="sd"> &quot;&quot;&quot;</span>
28662870

28672871
<span class="k">pass</span>
@@ -2970,7 +2974,27 @@ <h4 id="astra_rl.core.problem.ValueFunctionProblem.value" class="doc doc-heading
29702974
</td>
29712975
<td>
29722976
<div class="doc-md-description">
2973-
<p>prefixes.</p>
2977+
<p>prefixes. If you are predicting on the whole input, you should be slicing on</p>
2978+
</div>
2979+
</td>
2980+
</tr>
2981+
<tr class="doc-section-item">
2982+
<td>
2983+
<code><span title="torch.Tensor">Tensor</span></code>
2984+
</td>
2985+
<td>
2986+
<div class="doc-md-description">
2987+
<p><code>[:, :-1]</code>, meaning you should <em>not</em> return the value of the last token, whose</p>
2988+
</div>
2989+
</td>
2990+
</tr>
2991+
<tr class="doc-section-item">
2992+
<td>
2993+
<code><span title="torch.Tensor">Tensor</span></code>
2994+
</td>
2995+
<td>
2996+
<div class="doc-md-description">
2997+
<p>input is eos/context length limit.</p>
29742998
</div>
29752999
</td>
29763000
</tr>
@@ -2999,7 +3023,9 @@ <h4 id="astra_rl.core.problem.ValueFunctionProblem.value" class="doc doc-heading
29993023
<span class="normal">289</span>
30003024
<span class="normal">290</span>
30013025
<span class="normal">291</span>
3002-
<span class="normal">292</span></pre></div></td><td class="code"><div><pre><span></span><code><span class="nd">@abstractmethod</span>
3026+
<span class="normal">292</span>
3027+
<span class="normal">293</span>
3028+
<span class="normal">294</span></pre></div></td><td class="code"><div><pre><span></span><code><span class="nd">@abstractmethod</span>
30033029
<span class="k">def</span><span class="w"> </span><span class="nf">value</span><span class="p">(</span>
30043030
<span class="bp">self</span><span class="p">,</span> <span class="n">context</span><span class="p">:</span> <span class="n">Sequence</span><span class="p">[</span><span class="n">StateT</span><span class="p">],</span> <span class="n">continuation</span><span class="p">:</span> <span class="n">Sequence</span><span class="p">[</span><span class="n">ActionT</span><span class="p">]</span>
30053031
<span class="p">)</span> <span class="o">-&gt;</span> <span class="n">torch</span><span class="o">.</span><span class="n">Tensor</span><span class="p">:</span>
@@ -3015,7 +3041,9 @@ <h4 id="astra_rl.core.problem.ValueFunctionProblem.value" class="doc doc-heading
30153041
<span class="sd"> Returns:</span>
30163042
<span class="sd"> torch.Tensor[batch_size, max_continuation_length]: The per-token values of</span>
30173043
<span class="sd"> the given sequence by the sequence predictor. Do not include the value of the input</span>
3018-
<span class="sd"> prefixes.</span>
3044+
<span class="sd"> prefixes. If you are predicting on the whole input, you should be slicing on</span>
3045+
<span class="sd"> `[:, :-1]`, meaning you should *not* return the value of the last token, whose</span>
3046+
<span class="sd"> input is eos/context length limit.</span>
30193047
<span class="sd"> &quot;&quot;&quot;</span>
30203048

30213049
<span class="k">pass</span>
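
The slicing rule added to the `value` docstring can be illustrated with a small sketch. This is one reading of the docstring, not the library's implementation: it assumes the value head is run over prefix plus continuation and produces one value per position, and that the returned slice must have `max_continuation_length` columns. The helper name `continuation_values` and the `prefix_len` parameter are hypothetical, and plain Python lists stand in for tensors so the example runs without `torch`.

```python
from typing import List


def continuation_values(
    full_values: List[List[float]], prefix_len: int
) -> List[List[float]]:
    """Keep one value per continuation token and drop the final position.

    Hypothetical helper (not part of astra_rl). `full_values` has one row
    per batch item and one column per token of prefix + continuation.
    On a torch tensor the same slice would be written
    `full_values[:, prefix_len - 1 : -1]`.
    """
    if prefix_len < 1:
        raise ValueError("prefix_len must be at least 1")
    # Start at the last prefix position (assumed offset, chosen so the
    # result has exactly continuation-length columns) and stop before the
    # final position, whose input is eos / the context length limit.
    return [row[prefix_len - 1 : -1] for row in full_values]


# Batch of 2, prefix of 3 tokens, continuation of 4 tokens: 7 positions per row.
full = [
    [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7],
    [1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7],
]
sliced = continuation_values(full, prefix_len=3)
print(sliced)
```

Each returned row has four entries, matching the continuation length, and the value at the final position is never returned. If the value head in a given implementation is aligned differently, the start offset would change, but the `:-1` end bound is what the docstring asks for.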

search/search_index.json

Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.
