Skip to content

Conversation

@geoffgeoffgeoff3
Copy link

Hi Philip, I'll happily take feedback rather than a merge. I have done the work for only the --no-container case. I run with ...
bin/runexec --no-container --read-only-dir / --no-output-header --timestamp --add-eof --

@geoffgeoffgeoff3
Copy link
Author

Now the stdout and stderr work, interleaved, and I added "-" to allow output to stdout. But the --timelimit and --walltimelimit are not working.

@geoffgeoffgeoff3
Copy link
Author

The --timelimit and --walltimelimit are working. I was confused by the amount of WC time it takes to print a lot of tracing from my test program (recursive Fibonacci).

@geoffgeoffgeoff3
Copy link
Author

When you are happy with what I have done give me a push, and I'll work on the container version.

@PhilippWendler
Copy link
Member

In general this looks quite promising, nice, and thank you! We can still discuss naming a little bit.

I can also somewhat understand the request for configuring output files, though maybe you can also simply redirect the output?

Maybe we should not add the overhead of piping the tool output through BenchExec for everyone and disable the new code if no timestamp is requested? Should be easy.

When you are happy with what I have done give me a push, and I'll work on the container version.

Would be great!

I just remembered that we have a start of the GSoC attempt on this here: https://github.com/sosy-lab/benchexec/pull/1170/files It is a more complicated because it keeps stdout and stderr separate and thus needs to work on two streams in parallel. But actually I think your approach should also work, I don't see any disadvantage right now.

@geoffgeoffgeoff3
Copy link
Author

Yes, last night I also thought about disabling the code for timestamping. I'll do it this afternoon. I like the symmetry of having "-" to pipeline both stdin and stdout/stderr ... it's easy for users to remember when it's the same in both directions.

@geoffgeoffgeoff3
Copy link
Author

The container version has got me stumped for now. I might need you (Philip) to nudge me in the right direction.

@geoffgeoffgeoff3
Copy link
Author

I see the code ...

def _set_termination_reason(self, reason):
    if not self._termination_reason:
        self._termination_reason = reason

If I have a soft limit, typically that means the solver will catch the SIGTERM and continue. The code sets the termination reason with self.callback("cputime-soft"). If the solver hits the hard limit while continuing, and that really stops the solver, then the code tries to set the termination reason with self.callback("cputime"). However, as it was set previously the real new reason is not recorded and the output says "terminationreason=cputime-soft". I suggest taking out "if not self._termination_reason:". Whatcha think?

@PhilippWendler
Copy link
Member

It is deliberate that we record the first reason for termination that occurs, because this is the actual reason why the run was a failure. For example, if the process first reaches the soft memory limit and then the memory limit, it is a timeout and not an OOM.

@PhilippWendler
Copy link
Member

This PR is accumulating more and more independent features :-) I like them, but it is becoming a little bit hard to review the diff and keeping in mind which pieces of changes belong together. How much effort would it be to split off orthogonal stuff like soft wall-time limits and output redirections? If it is too much effort we can probably proceed with the current MR, but otherwise it would be helpful (and allow merging already finished features).

@PhilippWendler
Copy link
Member

The container version has got me stumped for now. I might need you (Philip) to nudge me in the right direction.

Finally I could really think about this. I have the hypothesis that either of the following would work:

  1. We implement the timestamping in the "child" process. That process currently does wait_for_child_and_forward_signals() while the process is running, so we could not do simple blocking I/O but would have to implement something that can handle I/O at the same time as forwarding signals. @larskotthoff reminded me of the possibility to use async I/O in Python and that this should make it easier to implement this solution, but I have no personal experience with it. A disadvantage of adding code to the "child" process is also that this process run inside the container and any problems in it are harder to debug (we do not get nice debug output).

  2. We send the required file descriptors (stdout/stderr of "grandchild") to "parent" via the existing sockets. It is possible to send file descriptors between processes with socket.send_fds(), though I have never done so.

  3. We pre-create pipes for stdout/stderr in the "parent" process, and let them being copied to the "child" process when it is created. Then we can use them there as parameters for Popen (so instead of letting subprocess create pipes and return them we supply our own). This should be relatively straight forward, we just need to make sure that we close all copies of these pipes in the appropriate places.

I would tend to say 3. looks like the most promising option. The place to create these new pipes would be in the same place here we already create pipes by calling os.pipe() in _start_execution_in_container.

@geoffgeoffgeoff3
Copy link
Author

It would be hard to separate the varo=ious things I have done:

  1. Time stamps and EOF for non-container
  2. Soft wall time limit
  3. IO and statistics redirection
    However, the hacking I have done in containerexecutor.py can be ignored - just replace it with an original containerexecutor.py. I'll do that and add it to the PR, then stop adding to that PR.

I will start a separate branch for the containerexecutor.py work.

@geoffgeoffgeoff3
Copy link
Author

Regarding the reason, how about ...

def _set_termination_reason(self, reason):
if not self._termination_reason:
self._termination_reason = reason
else:
self._termination_reason = self._termination_reason + " then " + reason

@geoffgeoffgeoff3
Copy link
Author

I have a cunning scheme for the container case, but my limited Python skills are holding me back. I think I'll need help from you or Marco (he's a Python wizard).

@geoffgeoffgeoff3
Copy link
Author

I marked the lines with change with ZZZZ, if that helps. I can remove those lines ATGB.

@geoffgeoffgeoff3
Copy link
Author

Strangely, the non-container "solver" runs slower than the container version.

@geoffgeoffgeoff3
Copy link
Author

You said we should discuss, at least, the option names. Already there was yuckkiness --timelimit --softtimelimit --walltimelimit, and my softwalltimelimit was a horrible attempt to be compatible. I know you have to keep the old ones for backwards compatibility, but they could be deprecated under these new options ...
--cpu-limit
--soft-cpu-limit
--wc-limit
--soft-wc-limit

--timestamp and --add-eof seem OK to me.

I shortened the stats file one to --statistics-file, which is fine for me.

Any others?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

2 participants